Re: Spark SQL with a sorted file
Michael,

Thanks. Is this still turned off in the released 1.2? Is it possible to turn it on, just to get an idea of how much of a difference it makes?

-Jerry

On 05/12/14 12:40 am, Michael Armbrust wrote:

I'll add that some of our data formats will actually infer this sort of useful information automatically. Both Parquet and cached in-memory tables keep statistics on the min/max value for each column. When you have predicates over these sorted columns, partitions will be eliminated if they can't possibly match the predicate given the statistics.

For Parquet this is new in Spark 1.2 and it is turned off by default (due to bugs we are working with the Parquet library team to fix). Hopefully soon it will be on by default.

On Wed, Dec 3, 2014 at 8:44 PM, Cheng, Hao hao.ch...@intel.com wrote:

You can try to write your own Relation with filter push down, or use ParquetRelation2 as a workaround (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala).

Cheng Hao

-----Original Message-----
From: Jerry Raj [mailto:jerry@gmail.com]
Sent: Thursday, December 4, 2014 11:34 AM
To: user@spark.apache.org
Subject: Spark SQL with a sorted file

Hi,

If I create a SchemaRDD from a file that I know is sorted on a certain field, is it possible to somehow pass that information on to Spark SQL so that SQL queries referencing that field are optimized?

Thanks
-Jerry

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
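To make the idea in Michael's reply concrete, here is a minimal sketch in plain Scala of how per-partition min/max statistics let a reader skip partitions of a sorted column. `ColumnStats`, `Partition`, and the helper functions are hypothetical names chosen for illustration; they are not Spark or Parquet APIs. For the Parquet case in Spark 1.2, the relevant switch appears to be the `spark.sql.parquet.filterPushdown` setting, which is an assumption worth checking against the 1.2 configuration documentation.

```scala
// Illustrative sketch only: these types are hypothetical, not Spark or Parquet classes.
object MinMaxPruning {
  // Per-partition statistics for one column: smallest and largest value seen.
  final case class ColumnStats(min: Int, max: Int)

  // A partition (or row group) with its statistics and its rows.
  final case class Partition(id: Int, stats: ColumnStats, rows: Seq[Int])

  // Cut already-sorted data into fixed-size chunks and record min/max per chunk.
  def partitionSorted(sorted: Seq[Int], size: Int): Seq[Partition] =
    sorted.grouped(size).zipWithIndex.map { case (chunk, i) =>
      Partition(i, ColumnStats(chunk.min, chunk.max), chunk)
    }.toVector

  // Keep only partitions whose [min, max] range could contain `value`.
  def candidatesForEquals(parts: Seq[Partition], value: Int): Seq[Partition] =
    parts.filter(p => p.stats.min <= value && value <= p.stats.max)

  // Evaluate `column = value`, scanning only the surviving partitions.
  def countEquals(parts: Seq[Partition], value: Int): Int =
    candidatesForEquals(parts, value).map(_.rows.count(_ == value)).sum

  def main(args: Array[String]): Unit = {
    val parts = partitionSorted(1 to 100, 10)
    val hit = candidatesForEquals(parts, 42)
    println(s"scanned ${hit.size} of ${parts.size} partitions, matches = ${countEquals(parts, 42)}")
  }
}
```

Because the data is sorted, the min/max ranges of the partitions do not overlap, so an equality predicate can match at most one of them; on unsorted data the ranges overlap and far fewer partitions can be skipped.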
Spark SQL DSL for joins?
Hi,

I'm using the Scala DSL for Spark SQL, but I'm not able to do joins. I have two tables (backed by Parquet files) and I need to join them on a common field (user_id). This works fine using standard SQL, but not using the language-integrated DSL. Neither

    t1.join(t2, on = 't1.user_id == t2.user_id)

nor

    t1.join(t2, on = Some('t1.user_id == t2.user_id))

works, or even compiles. I could not find any examples of how to perform a join using the DSL. Any pointers will be appreciated :)

Thanks
-Jerry
Re: Spark SQL DSL for joins?
Another problem with the DSL:

    t1.where('term == "dmin").count()

returns zero. But

    sqlCtx.sql("select * from t1 where term = 'dmin'").count()

returns 700, which I know is correct from the data. Is there something wrong with how I'm using the DSL?

Thanks

On 17/12/14 11:13 am, Jerry Raj wrote:

Hi,

I'm using the Scala DSL for Spark SQL, but I'm not able to do joins. I have two tables (backed by Parquet files) and I need to join them on a common field (user_id). This works fine using standard SQL, but not using the language-integrated DSL. Neither

    t1.join(t2, on = 't1.user_id == t2.user_id)

nor

    t1.join(t2, on = Some('t1.user_id == t2.user_id))

works, or even compiles. I could not find any examples of how to perform a join using the DSL. Any pointers will be appreciated :)

Thanks
-Jerry
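One plausible explanation for the zero count, offered as an assumption rather than a confirmed diagnosis: in Scala, `==` is ordinary equality and is evaluated immediately, so comparing a Symbol such as 'term with a String yields the Boolean false before Spark SQL sees anything, whereas the DSL's `===` operator builds an expression that is evaluated per row. The sketch below uses a hypothetical mini-DSL (`Expr`, `Column`, `Literal`, `EqualTo`, `SymbolOps`, and `plainEquals` are all made up for illustration and are not the Spark classes) to show the difference. How the real 1.2 DSL turns a Boolean into a filter is assumed here, not verified.

```scala
// Illustrative sketch only: a hypothetical mini-DSL, not the Spark SQL classes.
object EqualsVsTripleEquals {
  sealed trait Expr
  final case class Column(name: String) extends Expr
  final case class Literal(value: Any) extends Expr
  final case class EqualTo(left: Expr, right: Expr) extends Expr

  // Hypothetical DSL hook: `===` on a Symbol builds an expression tree.
  implicit class SymbolOps(sym: Symbol) {
    def ===(value: Any): Expr = EqualTo(Column(sym.name), Literal(value))
  }

  // Ordinary Scala equality, evaluated immediately.
  // A Symbol is never equal to a String, so this is false for such a pair.
  def plainEquals(a: Any, b: Any): Boolean = a == b

  // Evaluate a filter expression against a single row.
  def matches(expr: Expr, row: Map[String, Any]): Boolean = expr match {
    case EqualTo(Column(c), Literal(v)) => row.get(c).contains(v)
    case Literal(b: Boolean)            => b
    case _                              => false
  }

  // Count the rows that satisfy the filter expression.
  def count(rows: Seq[Map[String, Any]], expr: Expr): Int =
    rows.count(row => matches(expr, row))

  def main(args: Array[String]): Unit = {
    val rows = Seq(Map[String, Any]("term" -> "dmin"), Map[String, Any]("term" -> "other"))
    // Assumed equivalent of where('term == "dmin"): a constant-false filter.
    val constantFalse: Expr = Literal(plainEquals(Symbol("term"), "dmin"))
    println(count(rows, constantFalse))
    // Equivalent of where('term === "dmin"): a per-row comparison.
    println(count(rows, Symbol("term") === "dmin"))
  }
}
```

If the real DSL behaves the same way, writing the predicate with `===` instead of `==` would be the first thing to try, both in `where` and in a join condition.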
Spark SQL UDF returning a list?
Hi,

Can a UDF return a list of values that can be used in a WHERE clause? Something like:

    sqlCtx.registerFunction("myudf", { Array(1, 2, 3) })
    val sql = "select doc_id, doc_value from doc_table where doc_id in myudf()"

This does not work:

    Exception in thread "main" java.lang.RuntimeException: [1.57] failure: ``('' expected but identifier myudf found

I also tried returning a List of Ints; that did not work either. Is there a way to write a UDF that returns a list?

Thanks
-Jerry
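The parser error above says that IN must be followed by an opening parenthesis, that is, by a literal list. One possible workaround, sketched here under the assumption that the list can be computed in the driver program before the query is built, is to render the values into the SQL text instead of calling a UDF. `InListQuery`, `allowedDocIds`, `inList`, and `buildQuery` are hypothetical names, and the sketch only builds the query string; it does not touch Spark.

```scala
// Illustrative sketch only: builds the SQL text; no Spark classes are used.
object InListQuery {
  // Stand-in for whatever logic the UDF was meant to run (hypothetical).
  def allowedDocIds(): Seq[Int] = Seq(1, 2, 3)

  // Render the ids as a parenthesised, comma-separated IN list.
  def inList(ids: Seq[Int]): String = {
    require(ids.nonEmpty, "IN list must contain at least one id")
    ids.mkString("(", ", ", ")")
  }

  // Splice the rendered list into the query from the original post.
  def buildQuery(ids: Seq[Int]): String =
    s"select doc_id, doc_value from doc_table where doc_id in ${inList(ids)}"

  def main(args: Array[String]): Unit =
    println(buildQuery(allowedDocIds()))
}
```

The resulting string would then be passed to `sqlCtx.sql(...)` as in the original attempt. The values here are integers, so no quoting is needed; string values would need proper escaping before being spliced into SQL.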
Spark SQL with a sorted file
Hi,

If I create a SchemaRDD from a file that I know is sorted on a certain field, is it possible to somehow pass that information on to Spark SQL so that SQL queries referencing that field are optimized?

Thanks
-Jerry
Re: [silk] (no subject)
On 21/06/11 11:43 AM, Venkat Mangudi wrote: On Tuesday 21 June 2011 11:24 AM, Biju Chacko wrote: Sorry to go off on a tangent, but: So not agreeing with you. Incidentally, what do you think of this: I'm unlurking because I had to say
[ug-bosug] Inconsistent File System structure- so consistently
Possibly Linux is using the Solaris partition as swap. The partition type ID 0x82 denotes a Solaris partition, but Linux uses the same ID for a swap partition. You could check /etc/fstab on Linux to see what it's using for swap.

-Jerry

Amit k. Saha wrote:

Hi!

Okay, so this has happened to me with SXCE, OpenSolaris, and BeleniX (with the latest builds): I consistently get 'Error 16 Inconsistent File System Structure'. And every time, all I do between a working OpenSolaris installation and a non-working one is work on Linux (which is my other OS).

What is the problem? How do I get over this?

Thanks,
Amit

--
Amit Kumar Saha
http://blogs.sun.com/amitsaha/
http://amitksaha.blogspot.com
Skype: amitkumarsaha

___
ug-bosug mailing list
List-Unsubscribe: mailto:ug-bosug-unsubscribe at opensolaris.org
List-Owner: mailto:ug-bosug-owner at opensolaris.org
List-Archives: http://www.opensolaris.org/jive/forum.jspa?forumID=54
[ug-bosug] porting libntp .. AC_FUNC_MALLOC error
On 09/08/07 11:34, Anil Gulecha wrote:

Hi all,

I was trying to port the libnjb library, which allows talking to Creative (and other) mp3 players. The source consists of the library and a sample application. I had the initial -Wall errors, which I corrected, and the library has built fine. However, the sample app isn't building.

    bash-3.00$ pwd
    /tmp/libnjb-2.2.5
    bash-3.00$ ls src/.libs/
    base.o       ioutil.o    libnjb.so        njbtime.o    protocol3.o
    byteorder.o  libnjb.a    libnjb.so.5      playlist.o   songid.o
    datafile.o   libnjb.la   libnjb.so.5.1.0  procedure.o  unicode.o
    eax.o        libnjb.lai  njb_error.o      protocol.o   usb_io.o
    bash-3.00$ cd sample/
    bash-3.00$ make
    /bin/bash ../libtool --tag=CC --mode=link /opt/SUNWspro/bin/cc -I../src -g -L/usr/sfw/lib -o cursesplay cursesplay.o ../src/libnjb.la -lcurses -lusb
    /opt/SUNWspro/bin/cc -I../src -g -o .libs/cursesplay cursesplay.o -L/usr/sfw/lib ../src/.libs/libnjb.so -lcurses -lusb -R/usr/local/lib
    Undefined                       first referenced
     symbol                             in file
    rpl_malloc                          ../src/.libs/libnjb.so
    ld: fatal: Symbol referencing errors. No output written to .libs/cursesplay
    make: *** [cursesplay] Error 1
    bash-3.00$

Upon googling, I found out about the AC_FUNC_MALLOC macro in configure.ac, which checks for correct calling of malloc; malloc is replaced by rpl_malloc for incorrect calling. Workarounds are given at http://wiki.buici.com/wiki/Autoconf_and_RPL_MALLOC but these appear to be more hacky than a correct workaround. According to http://cygwin.com/ml/automake/2003-05/msg00043.html that error comes up when malloc isn't correctly called, and thus the correct thing to do in this situation is to patch the source to clear this error.

My questions are: What is the ideal way to patch this error? What is the best way to patch this error for a spec file? I removed the -Wall flag from the Makefiles; however, this was said to be hacky. What is the correct way of going about correcting this error?

Anil

This error should never happen on Solaris. The AC_FUNC_MALLOC autoconf macro checks whether malloc(0) returns a valid pointer, which it does on Solaris (and Linux). You could check the version of autoconf; maybe upgrading it will help. Or there could be problems with the autoconf input files for the project.

-Jerry