Re: is Branch-1.1 SBT build broken for yarn-alpha ?

2014-08-21 Thread witgo
There's a related discussion: https://issues.apache.org/jira/browse/SPARK-2815 -- Original message -- From: Chester Chen ches...@alpinenow.com; Sent: August 21, 2014 (Thu) 7:42; To: dev dev@spark.apache.org; Subject: Re: is Branch-1.1 SBT build broken for

Hang on Executor classloader lookup for the remote REPL URL classloader

2014-08-21 Thread Andrew Ash
Hi Spark devs, I'm seeing a stacktrace where the classloader that reads from the REPL is hung, and blocking all progress on that executor. Below is that hung thread's stacktrace, and also the stacktrace of another hung thread. I thought maybe there was an issue with the REPL's JVM on the other

[SNAPSHOT] Snapshot2 of Spark 1.1 has been posted

2014-08-21 Thread Patrick Wendell
Hi All, I've packaged and published a snapshot release of Spark 1.1 for testing. This is very close to RC1 and we are distributing it for testing. Please test this and report any issues on this thread. The tag of this release is v1.1.0-snapshot1 (commit e1535ad3):

Re: [SNAPSHOT] Snapshot2 of Spark 1.1 has been posted

2014-08-21 Thread Patrick Wendell
The docs for this release are also available here: http://people.apache.org/~pwendell/spark-1.1.0-snapshot2-docs/ On Thu, Aug 21, 2014 at 1:12 AM, Patrick Wendell pwend...@gmail.com wrote: Hi All, I've packaged and published a snapshot release of Spark 1.1 for testing. This is very close

Re: is Branch-1.1 SBT build broken for yarn-alpha ?

2014-08-21 Thread Sean Owen
Maven is just telling you that there is no version 1.1.0 of yarn-parent, and indeed, it has not been released. To build the branch you would need to mvn install to compile and make available local copies of artifacts along the way. (You may have these for 1.1.0-SNAPSHOT locally already). Use

Spark Contribution

2014-08-21 Thread Maisnam Ns
Hi, Can someone help me with some links on how to contribute for Spark Regards mns

Kinesis streaming integration in upcoming 1.1

2014-08-21 Thread Aniket Bhatnagar
Hi everyone I started looking at Kinesis integration and it looks promising. However, I feel like it can be improved. Here are my thoughts: 1. It assumes that AWS credentials are provided by DefaultAWSCredentialsProviderChain and there is no way to change the behavior. I would have liked to

Re: Spark SQL Query and join different data sources.

2014-08-21 Thread chutium
As far as I know, HQL queries try to find the schema info of all the tables in the query from the Hive metastore, so it is not possible to join tables from sqlContext using hiveContext.hql. But this should work: hiveContext.hql("select ...").regAsTable("a") sqlContext.jsonFile("xxx").regAsTable("b") then

Re: is Branch-1.1 SBT build broken for yarn-alpha ?

2014-08-21 Thread Mridul Muralidharan
Weird that Patrick did not face this while creating the RC. Essentially the yarn alpha pom.xml has not been updated properly in the 1.1 branch. Just change version to '1.1.1-SNAPSHOT' for yarn/alpha/pom.xml (to make it same as any other pom). Regards, Mridul On Thu, Aug 21, 2014 at 5:09 AM,

Re: is Branch-1.1 SBT build broken for yarn-alpha ?

2014-08-21 Thread Chester @work
Do we have Jenkins tests for these? It should be pretty easy to set up, just to test the basic build. Sent from my iPhone On Aug 21, 2014, at 6:45 AM, Mridul Muralidharan mri...@gmail.com wrote: Weird that Patrick did not face this while creating the RC. Essentially the yarn alpha pom.xml has not been

RE: Spark SQL Query and join different data sources.

2014-08-21 Thread Yan Zhou.sc
I doubt it will work as expected. Note that hiveContext.hql("select ...").regAsTable("a") will create a SchemaRDD before registering the SchemaRDD with the (Hive) catalog, while sqlContext.jsonFile("xxx").regAsTable("b") will create a SchemaRDD before registering the SchemaRDD with the SparkSQL

Re: Spark Contribution

2014-08-21 Thread Henry Saputra
The Apache Spark wiki on how to contribute should be great place to start: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark - Henry On Thu, Aug 21, 2014 at 3:25 AM, Maisnam Ns maisnam...@gmail.com wrote: Hi, Can someone help me with some links on how to contribute for

PARSING_ERROR from kryo

2014-08-21 Thread npanj
Hi All, I am getting PARSING_ERROR while running my job on the code checked out up to commit# db56f2df1b8027171da1b8d2571d1f2ef1e103b6. I am running this job on EC2. Any idea if there is something wrong with my config? Here is my config: -- .set(spark.executor.extraJavaOptions,

Re: Spark Contribution

2014-08-21 Thread Nicholas Chammas
We should add this link to the readme on GitHub btw. On Thursday, August 21, 2014, Henry Saputra henry.sapu...@gmail.com wrote: The Apache Spark wiki on how to contribute should be great place to start: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark - Henry On Thu, Aug 21,

Re: Lost executor on YARN ALS iterations

2014-08-21 Thread Debasish Das
Sandy, I put spark.yarn.executor.memoryOverhead 1024 in spark-defaults.conf, but I don't see it among the Spark properties on the web UI's Environment page. Does it need to go in spark-env.sh? Thanks. Deb On Wed, Aug 20, 2014 at 12:39 AM, Sandy Ryza sandy.r...@cloudera.com wrote: Hi

saveAsTextFile makes no progress without caching RDD

2014-08-21 Thread jerryye
Hi, Cross-posting this from users list. I'm running on branch-1.1 and trying to do a simple transformation to a relatively small dataset of 64GB and saveAsTextFile essentially hangs and tasks are stuck in running mode with the following code: // Stalls with tasks running for over an hour with

Re: saveAsTextFile to s3 on spark does not work, just hangs

2014-08-21 Thread jerryye
bump. I'm seeing the same issue with branch-1.1. Caching the RDD before running saveAsTextFile gets things running but the job stalls 2/3 of the way by using too much memory. -- View this message in context:

Storage Handlers in Spark SQL

2014-08-21 Thread Niranda Perera
Hi, I have been playing around with Spark for the past few days, and evaluating the possibility of migrating into Spark (Spark SQL) from Hive/Hadoop. I am working on the WSO2 Business Activity Monitor (WSO2 BAM, https://docs.wso2.com/display/BAM241/WSO2+Business+Activity+Monitor+Documentation )

RE: Spark SQL Query and join different data sources.

2014-08-21 Thread alexliu68
Presto is so far good at joining different sources/databases. I tried a simple join query in Spark SQL; it fails with the following errors: val a = cql("select test.a from test JOIN test1 on test.a = test1.a") a: org.apache.spark.sql.SchemaRDD = SchemaRDD[0] at RDD at SchemaRDD.scala:104 == Query