Re: Spark SQL with a sorted file

2014-12-22 Thread Jerry Raj

Michael,
Thanks. Is this still turned off in the released 1.2? Is it possible to 
turn it on just to get an idea of how much of a difference it makes?


-Jerry

On 05/12/14 12:40 am, Michael Armbrust wrote:

I'll add that some of our data formats will actually infer this sort of
useful information automatically.  Both Parquet and cached in-memory
tables keep statistics on the min/max values for each column.  When you
have predicates over these sorted columns, partitions will be eliminated
if they can't possibly match the predicate given the statistics.

For Parquet this is new in Spark 1.2 and it is turned off by default
(due to bugs we are working with the Parquet library team to fix).
Hopefully soon it will be on by default.
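
[Editor's sketch, for readers following along: a minimal way to flip that
switch in a 1.2 shell. The config key spark.sql.parquet.filterPushdown is
assumed to be the flag under discussion, and the path, table, and column
names are illustrative.]

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)  // sc: the shell's SparkContext
// Assumed name of the experimental Parquet pushdown flag in 1.2.
sqlContext.setConf("spark.sql.parquet.filterPushdown", "true")

val docs = sqlContext.parquetFile("/data/docs.parquet")  // hypothetical path
docs.registerTempTable("docs")
// With pushdown on, row groups whose min/max statistics rule out the
// predicate are skipped rather than read and filtered.
sqlContext.sql("select count(*) from docs where doc_id > 1000").collect()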

On Wed, Dec 3, 2014 at 8:44 PM, Cheng, Hao hao.ch...@intel.com wrote:

You can try to write your own Relation with filter pushdown, or use
ParquetRelation2 as a workaround.

(https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala)

Cheng Hao
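
[Editor's sketch of the first suggestion, written against the external data
sources API that ships in 1.2 (org.apache.spark.sql.sources). The relation
name, schema, and skipping strategy are placeholders, and the exact class
shapes moved around in later releases.]

import org.apache.spark.rdd.RDD
import org.apache.spark.sql._
import org.apache.spark.sql.sources.{Filter, PrunedFilteredScan}

// Hypothetical relation over a file known to be sorted on doc_id.
case class SortedFileRelation(path: String)(@transient val sqlContext: SQLContext)
  extends PrunedFilteredScan {

  override def schema: StructType = StructType(
    StructField("doc_id", IntegerType, nullable = false) ::
    StructField("doc_value", StringType, nullable = true) :: Nil)

  override def buildScan(requiredColumns: Array[String],
                         filters: Array[Filter]): RDD[Row] = {
    // Predicates such as GreaterThan("doc_id", v) arrive here already
    // pushed down. Because the file is sorted on doc_id, a real
    // implementation could seek past ranges that cannot match instead of
    // scanning everything. This stub returns an empty RDD so the sketch
    // compiles.
    sqlContext.sparkContext.emptyRDD[Row]
  }
}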

-Original Message-
From: Jerry Raj [mailto:jerry@gmail.com]
Sent: Thursday, December 4, 2014 11:34 AM
To: user@spark.apache.org
Subject: Spark SQL with a sorted file
Subject: Spark SQL with a sorted file

Hi,
If I create a SchemaRDD from a file that I know is sorted on a
certain field, is it possible to somehow pass that information on to
Spark SQL so that SQL queries referencing that field are optimized?

Thanks
-Jerry

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
mailto:user-unsubscr...@spark.apache.org For additional commands,
e-mail: user-h...@spark.apache.org mailto:user-h...@spark.apache.org


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
mailto:user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
mailto:user-h...@spark.apache.org




-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Spark SQL DSL for joins?

2014-12-16 Thread Jerry Raj

Hi,
I'm using the Scala DSL for Spark SQL, but I'm not able to do joins. I 
have two tables (backed by Parquet files) and I need to join them on a 
common field (user_id). This works fine using standard SQL, but with the 
language-integrated DSL neither


t1.join(t2, on = 't1.user_id == t2.user_id)

nor

t1.join(t2, on = Some('t1.user_id == t2.user_id))

works, or even compiles. I could not find any examples of how to perform a 
join using the DSL. Any pointers will be appreciated :)


Thanks
-Jerry
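
[Editor's sketch of what the 1.2 DSL appears to expect; this is a hedged
reconstruction, not an answer from the thread. Scala's == compares the two
Symbols and yields a plain Boolean, while catalyst expressions are built
with ===; and since both sides share the column name user_id, the tables
need aliases so the attributes can be qualified. Paths below are
illustrative.]

import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute

val sqlCtx = new SQLContext(sc)  // sc: the shell's SparkContext
import sqlCtx._                  // implicits for the expression DSL

val t1 = sqlCtx.parquetFile("/data/t1.parquet")  // hypothetical paths
val t2 = sqlCtx.parquetFile("/data/t2.parquet")

// === builds a catalyst EqualTo expression, which is what `on` expects;
// aliasing the two sides lets the shared key be qualified unambiguously.
val joined = t1.as('a).join(t2.as('b),
  on = Some(UnresolvedAttribute("a.user_id") === UnresolvedAttribute("b.user_id")))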

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark SQL DSL for joins?

2014-12-16 Thread Jerry Raj

Another problem with the DSL:

t1.where('term == "dmin").count() returns zero. But
sqlCtx.sql("select * from t1 where term = 'dmin'").count() returns 700, 
which I know is correct from the data. Is there something wrong with how 
I'm using the DSL?


Thanks


On 17/12/14 11:13 am, Jerry Raj wrote:

Hi,
I'm using the Scala DSL for Spark SQL, but I'm not able to do joins. I
have two tables (backed by Parquet files) and I need to join them on a
common field (user_id). This works fine using standard SQL, but with the
language-integrated DSL neither

t1.join(t2, on = 't1.user_id == t2.user_id)

nor

t1.join(t2, on = Some('t1.user_id == t2.user_id))

works, or even compiles. I could not find any examples of how to perform a
join using the DSL. Any pointers will be appreciated :)

Thanks
-Jerry

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Spark SQL UDF returning a list?

2014-12-03 Thread Jerry Raj

Hi,
Can a UDF return a list of values that can be used in a WHERE clause? 
Something like:


sqlCtx.registerFunction("myudf", {
  Array(1, 2, 3)
})

val sql = "select doc_id, doc_value from doc_table where doc_id in myudf()"


This does not work:

Exception in thread "main" java.lang.RuntimeException: [1.57] failure: 
``('' expected but identifier myudf found


I also tried returning a List of Ints; that did not work either. Is 
there a way to write a UDF that returns a list?


Thanks
-Jerry
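
[Editor's sketch of one hedged workaround, not taken from the thread: the
1.x SQL parser expects a parenthesized list of literals after IN, so
instead of a UDF returning a list, register a Boolean predicate UDF that
closes over the values and call it directly in the WHERE clause. Names are
illustrative.]

val ids = Set(1, 2, 3)
// A predicate UDF that closes over the set of ids.
sqlCtx.registerFunction("inMyIds", (id: Int) => ids.contains(id))

val rows = sqlCtx.sql(
  "select doc_id, doc_value from doc_table where inMyIds(doc_id)")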

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Spark SQL with a sorted file

2014-12-03 Thread Jerry Raj

Hi,
If I create a SchemaRDD from a file that I know is sorted on a certain 
field, is it possible to somehow pass that information on to Spark SQL 
so that SQL queries referencing that field are optimized?


Thanks
-Jerry

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: [silk] (no subject)

2011-06-21 Thread Jerry Raj


On 21/06/11 11:43 AM, Venkat Mangudi wrote:
 On Tuesday 21 June 2011 11:24 AM, Biju Chacko wrote:
 Sorry to go off on a tangent, but:
 
 So not agreeing with you. Incidentally, what do you think of this:

I'm unlurking because I had to say



[ug-bosug] Inconsistent File System structure- so consistently

2008-08-23 Thread Jerry Raj
Possibly Linux is using the Solaris partition as swap. The partition type ID 
0x82 denotes a Solaris partition, but Linux uses the same ID to denote a swap 
partition.
You could check /etc/fstab on Linux to see what it's using for swap.

-Jerry

Amit k. Saha wrote:
 Hi!
 
 Okay, so this has happened to me with SXCE, OpenSolaris, and BeleniX
 (with the latest builds) - I get 'Error 16: Inconsistent File System
 Structure' so consistently. And at all times, all I do between a
 working OpenSolaris installation and a non-working one is work on
 Linux (which is my other OS).
 
 What is the problem? How do I get over this?
 
 Thanks,
 Amit
 --
 Amit Kumar Saha
 http://blogs.sun.com/amitsaha/
 http://amitksaha.blogspot.com
 Skype: amitkumarsaha



[ug-bosug] porting libnjb .. AC_FUNC_MALLOC error

2007-08-10 Thread Jerry Raj


On 09/08/07 11:34, Anil Gulecha wrote:
 Hi all,
 
 I was trying to port the libnjb library, which allows talking to
 Creative (and other) MP3 players. The source consists of the libraries
 and a sample application.
 
 I had the initial -Wall errors, which I corrected, and the library has
 built fine. However, the sample app isn't building.
 
 bash-3.00$ pwd
 /tmp/libnjb-2.2.5
 bash-3.00$ ls src/.libs/
 base.o   ioutil.olibnjb.sonjbtime.oprotocol3.o
 byteorder.o  libnjb.alibnjb.so.5  playlist.o   songid.o
 datafile.o   libnjb.la   libnjb.so.5.1.0  procedure.o  unicode.o
 eax.olibnjb.lai  njb_error.o  protocol.o   usb_io.o
 bash-3.00$ cd sample/
 bash-3.00$ make
 /bin/bash ../libtool --tag=CC --mode=link /opt/SUNWspro/bin/cc
 -I../src -g  -L/usr/sfw/lib -o cursesplay  cursesplay.o
 ../src/libnjb.la -lcurses -lusb
 /opt/SUNWspro/bin/cc -I../src -g -o .libs/cursesplay cursesplay.o
 -L/usr/sfw/lib ../src/.libs/libnjb.so -lcurses -lusb -R/usr/local/lib
 Undefined   first referenced
  symbol in file
 rpl_malloc  ../src/.libs/libnjb.so
 ld: fatal: Symbol referencing errors. No output written to .libs/cursesplay
 make: *** [cursesplay] Error 1
 bash-3.00$
 
 
 Upon googling I found out about the AC_FUNC_MALLOC macro in configure.ac,
 which checks whether malloc is called correctly; when the check fails,
 malloc is replaced by rpl_malloc.
 
 Workarounds are given at http://wiki.buici.com/wiki/Autoconf_and_RPL_MALLOC
 
 but these appear to be more hacky than correct workarounds. According
 to http://cygwin.com/ml/automake/2003-05/msg00043.html that error
 appears when malloc isn't correctly called, and thus the correct
 thing to do in this situation is to patch the source to clear the
 error.
 
 My questions are:
 
 What is the ideal way to patch this error?
 What is the best way to patch this error for a spec file?
 I removed the -Wall flag from the Makefiles; however, this was said to
 be hacky. What is the correct way of going about correcting this
 error?

This error should never happen on Solaris. The AC_FUNC_MALLOC autoconf
macro checks whether malloc(0) returns a valid pointer, which it does on
Solaris (and Linux). You could check the version of autoconf; maybe
upgrading it will help. Or there could be a problem with the project's
autoconf input files.

-Jerry

 
 Anil