How does Spark calculate partition size automatically?

2015-01-12 Thread rajnish
Hi,

When I run a job that loads data from Cassandra, Spark creates almost 9
million partitions. How does Spark decide the partition count? I have read
in a presentation that it is good to have 1,000 to 10,000 partitions.

Regards
Raj
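
For reference: with the DataStax spark-cassandra-connector 1.x, the partition
count is derived from spark.cassandra.input.split.size (the approximate number
of Cassandra rows per Spark partition), so a large table read with the default
split size can easily produce millions of partitions. A minimal sketch of
inspecting and tuning the count, assuming connector 1.x and hypothetical
host/keyspace/table names:

  import org.apache.spark.{SparkConf, SparkContext}
  import com.datastax.spark.connector._   // brings cassandraTable() into scope

  val conf = new SparkConf()
    .setAppName("partition-count-check")
    .set("spark.cassandra.connection.host", "127.0.0.1") // hypothetical host
    .set("spark.cassandra.input.split.size", "1000000")  // approx. rows per partition;
                                                         // larger value => fewer partitions
  val sc  = new SparkContext(conf)
  val rdd = sc.cassandraTable("my_keyspace", "my_table") // hypothetical keyspace/table

  println(s"partitions: ${rdd.partitions.length}")       // inspect the count the connector chose
  val compacted = rdd.coalesce(5000)                     // or repartition(n) to force a target count

Raising the split size (or coalescing after the load) is the usual way to get
into the 1,000 to 10,000 partition range mentioned above.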






Find S3 file attributes from Spark

2015-01-08 Thread rajnish
Hi,

We have a file in an AWS S3 bucket that is loaded frequently. When accessing
that file from Spark, can we get the file's properties by some method in Spark?


Regards
Raj
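
For reference, one way to do this: Spark ships with the Hadoop FileSystem API,
which can report an S3 object's attributes (size, modification time). A minimal
sketch, assuming an existing SparkContext named sc and a hypothetical
bucket/key (s3n:// was the usual scheme for Spark at the time):

  import java.net.URI
  import org.apache.hadoop.fs.{FileSystem, Path}

  val path   = new Path("s3n://my-bucket/data/input.csv")  // hypothetical bucket and key
  val fs     = FileSystem.get(URI.create(path.toString), sc.hadoopConfiguration)
  val status = fs.getFileStatus(path)

  println(s"size (bytes)      : ${status.getLen}")
  println(s"last modified (ms): ${status.getModificationTime}")

Comparing getModificationTime across runs is a simple way to detect that the
file was re-uploaded since it was last read.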






Re: API to get the status of Spark workers

2015-01-05 Thread rajnish
You can use port 4040, which serves the web UI of the currently running
application. Its Executors tab gives a detailed summary of the currently
running executors.
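
If you need the status programmatically rather than in a browser, the
standalone master's web UI (default port 8080) also exposes a JSON summary at
/json, listing each worker with its state, cores, and memory. A minimal
sketch, assuming a hypothetical master host named spark-master:

  import scala.io.Source

  // Fetch the master's JSON status page and print it.
  val masterJson = Source.fromURL("http://spark-master:8080/json").mkString
  println(masterJson)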






Timeout Exception in standalone cluster

2015-01-05 Thread rajnish
Hi,

I am getting the following exception in a Spark (1.1.0) job that is running on
a standalone cluster. My cluster configuration is:

Intel(R) 2.50GHz, 4 cores
16 GB RAM
5 machines

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1134)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:52)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:113)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:156)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.security.PrivilegedActionException: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    ... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
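
The 30-second default in this trace matches spark.akka.lookupTimeout in Spark
1.x, so a common first step is to raise that (and the general Akka timeout) to
give executors more time to register with the driver. A minimal sketch, with
illustrative values that would need tuning for a real workload:

  import org.apache.spark.SparkConf

  val conf = new SparkConf()
    .setAppName("timeout-tuning")
    .set("spark.akka.lookupTimeout", "120")  // seconds; the Spark 1.x default is 30
    .set("spark.akka.timeout", "300")        // seconds; general Akka remote-ask timeout

The same keys can also be set cluster-wide in conf/spark-defaults.conf.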






Re: Failed to read chunk exception

2014-12-29 Thread rajnish
I am facing the same issue in Spark 1.1.0:

14/12/29 20:44:31 INFO scheduler.TaskSetManager: Starting task 5.0 in stage 1.1 (TID 1373, X.X.X.X, ANY, 2185 bytes)
14/12/29 20:44:31 WARN scheduler.TaskSetManager: Lost task 6.0 in stage 3.0 (TID 1367, X.X.X.X): java.io.IOException: failed to read chunk
    org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:348)
    org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:384)
    java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2293)
    java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2586)
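
The exception comes from the Snappy stream Spark uses for block compression. A
common way to isolate or work around it is to switch spark.io.compression.codec
to another supported codec; a minimal sketch (snappy is the default in 1.1,
with lzf and lz4 as alternatives):

  import org.apache.spark.SparkConf

  val conf = new SparkConf()
    .setAppName("codec-workaround")
    .set("spark.io.compression.codec", "lzf")  // replaces the default snappy codec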



