Re: Spark performance in cluster mode using yarn

2015-05-15 Thread Sachin Singh
Hi Ayan,
I am asking experts about the general scenario for the given info/configuration,
not a specific case. The Java code is nothing more than getting a Hive context
and running a select query; there is no serialization or anything else complex.
It is straightforward, about 10 lines of code.
Group, please suggest if you have any idea.

Regards
Sachin
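
For reference, a minimal sketch of the kind of job described above, using the
Spark 1.x Java API (class, table, and column names here are hypothetical, not
from the original post):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.api.java.JavaSchemaRDD;
    import org.apache.spark.sql.hive.api.java.JavaHiveContext;

    public class MonthlyAggregation {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("MonthlyAggregation");
            JavaSparkContext sc = new JavaSparkContext(conf);
            JavaHiveContext hiveCtx = new JavaHiveContext(sc);
            // one group-by aggregation over a Hive table for a single month
            JavaSchemaRDD monthly = hiveCtx.sql(
                "SELECT account_id, SUM(amount) FROM usage_data "
                + "WHERE month = '2015-04' GROUP BY account_id");
            monthly.saveAsTextFile("/user/output/monthly_agg"); // illustrative sink
        }
    }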

On Fri, May 15, 2015 at 6:57 AM, ayan guha guha.a...@gmail.com wrote:

 With this information it is hard to predict. What performance are you
 getting? What is your desired performance? Maybe you can post your code so
 experts can suggest improvements?
 On 14 May 2015 15:02, sachin Singh sachin.sha...@gmail.com wrote:

 Hi Friends,
 can someone please give me an idea of what the time (for complete job
 execution) should ideally be for this Spark job?

 I have data in a Hive table; the amount of data is about 1 GB, roughly
 200,000 (2 lakh) rows for a whole month.
 I want to do a monthly aggregation using SQL queries (group by).

 I have only one node (a single-node cluster), with the below configuration
 for running the job:
 --num-executors 2 --driver-memory 3g --driver-java-options
 -XX:MaxPermSize=1G --executor-memory 2g --executor-cores 2

 Approximately how much time should the job take to finish?

 Or can someone suggest the best way to get results quickly?

 Thanks in advance,







Spark performance in cluster mode using yarn

2015-05-13 Thread sachin Singh
Hi Friends,
can someone please give me an idea of what the time (for complete job
execution) should ideally be for this Spark job?

I have data in a Hive table; the amount of data is about 1 GB, roughly
200,000 (2 lakh) rows for a whole month.
I want to do a monthly aggregation using SQL queries (group by).

I have only one node (a single-node cluster), with the below configuration
for running the job:
--num-executors 2 --driver-memory 3g --driver-java-options
-XX:MaxPermSize=1G --executor-memory 2g --executor-cores 2

Approximately how much time should the job take to finish?

Or can someone suggest the best way to get results quickly?

Thanks in advance,
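
For a dataset this small (about 1 GB, 200,000 rows), one setting worth trying
(a hedged suggestion; spark.sql.shuffle.partitions is a standard Spark SQL
property, and the value below is only illustrative) is lowering the default
200 shuffle partitions so the group-by does not schedule far more tasks than
there is data:

    // run once on the Hive context before the aggregation query
    hiveCtx.sql("SET spark.sql.shuffle.partitions=8");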






spark yarn-cluster job failing in batch processing

2015-04-23 Thread sachin Singh
Hi All,
I am trying to execute batch processing in yarn-cluster mode, i.e. I have
many SQL insert queries; based on the argument provided, the job fetches the
queries, creates the context and schema RDDs, and inserts into Hive tables.

Please note: in standalone mode it works, and in cluster mode it works when I
configure only one query. I have also configured
yarn.nodemanager.delete.debug-delay-sec = 600.

I am using the below command:

spark-submit --jars
./analiticlibs/utils-common-1.0.0.jar,./analiticlibs/mysql-connector-java-5.1.17.jar,./analiticlibs/log4j-1.2.17.jar
--files datasource.properties,log4j.properties,hive-site.xml --deploy-mode
cluster --master yarn --num-executors 1 --driver-memory 2g
--driver-java-options -XX:MaxPermSize=1G --executor-memory 1g
--executor-cores 1 --class com.java.analitics.jobs.StandaloneAggregationJob
sparkanalitics-1.0.0.jar daily_agg 2015-04-21


Exception from the container log:

Exception in thread "Driver" java.lang.ArrayIndexOutOfBoundsException: 2
    at com.java.analitics.jobs.StandaloneAggregationJob.main(StandaloneAggregationJob.java:62)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:427)

Exception from our application's exception log file:

 diagnostics: Application application_1429800386537_0001 failed 2 times due
to AM Container for appattempt_1429800386537_0001_02 exited with
exitCode: 15 due to: Exception from container-launch.
Container id: container_1429800386537_0001_02_01
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 15. Failing this attempt. Failing
the application.
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: root.hdfs
 start time: 1429800525569
 final status: FAILED
 tracking URL:
http://tejas.alcatel.com:8088/cluster/app/application_1429800386537_0001
 user: hdfs
2015-04-23 20:19:27 DEBUG Client - stopping client from cache:
org.apache.hadoop.ipc.Client@12f5f40b
2015-04-23 20:19:27 DEBUG Utils - Shutdown hook called

I need urgent support on this.
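
The ArrayIndexOutOfBoundsException: 2 at StandaloneAggregationJob.java:62
suggests the job reads a third command-line argument (args[2]) while the
spark-submit command above passes only two (daily_agg and 2015-04-21); in
yarn-cluster mode only the arguments after the application jar reach main().
A hedged guard sketch (the argument names are hypothetical):

    public static void main(String[] args) {
        // fail fast with a clear message instead of an index error
        if (args.length < 3) {
            System.err.println(
                "usage: StandaloneAggregationJob <aggType> <date> <extraArg>");
            System.exit(2);
        }
        String aggType = args[0];  // e.g. daily_agg
        String runDate = args[1];  // e.g. 2015-04-21
        // ... rest of the job
    }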






Re: Spark sql failed in yarn-cluster mode when connecting to non-default hive database

2015-04-13 Thread sachin Singh
Hi Linlin,
have you got a solution for this issue? If yes, what needed to be corrected?
I am also getting the same error when submitting a Spark job in cluster mode:
2015-04-14 18:16:43 DEBUG Transaction - Transaction rolled back in 0 ms
2015-04-14 18:16:43 ERROR DDLTask -
org.apache.hadoop.hive.ql.metadata.HiveException: Database does not exist: my_database
    at org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:4054)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:269)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java
...


Please suggest. I have copied hive-site.xml into spark/conf; in standalone
mode it works fine.






Exception Driver-Memory while running Spark job on Yarn-cluster

2015-04-13 Thread sachin Singh
Hi,
when I submit a Spark job with --master yarn-cluster using the below
command/options, I get a driver memory error:

spark-submit --jars
./libs/mysql-connector-java-5.1.17.jar,./libs/log4j-1.2.17.jar --files
datasource.properties,log4j.properties --master yarn-cluster --num-executors
1 --driver-memory 2g --executor-memory 512m --class
com.test.spark.jobs.AggregationJob sparkagg.jar 

Exceptions from the YARN application log are as under:

Container: container_1428938273236_0006_01_01 on mycom.hostname.com_8041
=
LogType: stderr
LogLength: 128
Log Contents:
Exception in thread "Driver"
Exception: java.lang.OutOfMemoryError thrown from the
UncaughtExceptionHandler in thread "Driver"

LogType: stdout
LogLength: 40


Container: container_1428938273236_0006_02_01 on mycom.hostname.com_8041
=
LogType: stderr
LogLength: 1365
Log Contents:
java.io.IOException: Log directory
hdfs://mycom.hostname.com:8020/user/spark/applicationHistory/application_1428938273236_0006
already exists!
    at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:129)
    at org.apache.spark.util.FileLogger.start(FileLogger.scala:115)
    at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:74)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:353)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
LogType: stdout
LogLength: 40


Please help; this is urgent for me.
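
Reading the two containers together: attempt 1 died of the OutOfMemoryError
(with only 512m of executor memory), and attempt 2 then tripped over the
event-log directory that attempt 1 had already created, so the "already
exists" error is likely just fallout from the retry. A hedged sketch of one
mitigation (spark.eventLog.overwrite is a standard Spark 1.x setting; the app
name is hypothetical):

    SparkConf conf = new SparkConf()
        .setAppName("AggregationJob")
        // let a retried AM attempt overwrite the event log left behind
        .set("spark.eventLog.overwrite", "true");
    JavaSparkContext sc = new JavaSparkContext(conf);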







need info on Spark submit on yarn-cluster mode

2015-04-08 Thread sachin Singh
Hi,
I observed that we have only one node installed in the cluster, and when
submitting the job as yarn-cluster I get the below error. Is the cause that
the installation has only one node?
Please correct me; if this is not the cause, why am I not able to run in
cluster mode?
The spark-submit command is:
spark-submit --jars some dependent jars... --master yarn --class
com.java.jobs.sparkAggregation mytest-1.0.0.jar

2015-04-08 19:16:50 INFO  Client - Application report for
application_1427895906171_0087 (state: FAILED)
2015-04-08 19:16:50 DEBUG Client -
 client token: N/A
 diagnostics: Application application_1427895906171_0087 failed 2 times due
to AM Container for appattempt_1427895906171_0087_02 exited with
exitCode: 15 due to: Exception from container-launch.
Container id: container_1427895906171_0087_02_01
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 15. Failing this attempt. Failing
the application.
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: root.hdfs
 start time: 1428500770818
 final status: FAILED


Exception in thread "main" org.apache.spark.SparkException: Application
finished with failed status
    at org.apache.spark.deploy.yarn.ClientBase$class.run(ClientBase.scala:509)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:35)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:139)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
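
Running on a single node is not by itself the problem; exit code 15 generally
means the application code failed inside the ApplicationMaster, and the
client-side trace above only echoes that failure. The actual driver stack
trace lives in the container logs, which can be pulled with the standard YARN
CLI, e.g. (application id taken from the report above):

    yarn logs -applicationId application_1427895906171_0087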







issue while submitting Spark Job as --master yarn-cluster

2015-03-25 Thread sachin Singh
Hi,
when I submit a Spark job in cluster mode I get the error below in the
hadoop-yarn log. Does someone have any idea? Please suggest.

2015-03-25 23:35:22,467 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
application_1427124496008_0028 State change from FINAL_SAVING to FAILED
2015-03-25 23:35:22,467 WARN
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hdfs
OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE
DESCRIPTION=App failed with state: FAILED   PERMISSIONS=Application
application_1427124496008_0028 failed 2 times due to AM Container for
appattempt_1427124496008_0028_02 exited with exitCode: 13 due to:
Exception from container-launch.
Container id: container_1427124496008_0028_02_01
Exit code: 13
Stack trace: ExitCodeException exitCode=13:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 13. Failing this attempt. Failing
the application.
APPID=application_1427124496008_0028
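
Exit code 13 from the AM very commonly means the application code creates its
SparkContext with a master (for example local) that conflicts with the
yarn-cluster master given to spark-submit, which fits the follow-up in this
archive that plain master yarn runs fine. A hedged sketch of the usual fix
(the app name is hypothetical): leave the master out of the code and let
spark-submit supply it.

    SparkConf conf = new SparkConf().setAppName("sparkAggregation");
    // no setMaster(...) here: --master on the spark-submit line decides it
    JavaSparkContext sc = new JavaSparkContext(conf);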






Re: issue while submitting Spark Job as --master yarn-cluster

2015-03-25 Thread Sachin Singh
The OS I am using is Linux.
When I run simply with master yarn, it runs fine.

Regards
Sachin

On Wed, Mar 25, 2015 at 4:25 PM, Xi Shen davidshe...@gmail.com wrote:

 What is your environment? I remember I had a similar error when running
 spark-shell --master yarn-client in a Windows environment.


 On Wed, Mar 25, 2015 at 9:07 PM sachin Singh sachin.sha...@gmail.com
 wrote:

 Hi,
 when I submit a Spark job in cluster mode I get the error below in the
 hadoop-yarn log. Does someone have any idea? Please suggest.

 2015-03-25 23:35:22,467 INFO
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
 application_1427124496008_0028 State change from FINAL_SAVING to FAILED
 2015-03-25 23:35:22,467 WARN
 org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hdfs
 OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE
 DESCRIPTION=App failed with state: FAILED   PERMISSIONS=Application
 application_1427124496008_0028 failed 2 times due to AM Container for
 appattempt_1427124496008_0028_02 exited with exitCode: 13 due to:
 Exception from container-launch.
 Container id: container_1427124496008_0028_02_01
 Exit code: 13
 Stack trace: ExitCodeException exitCode=13:
     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
     at org.apache.hadoop.util.Shell.run(Shell.java:455)
     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:745)

 Container exited with a non-zero exit code 13. Failing this attempt. Failing
 the application.
 APPID=application_1427124496008_0028







Re: issue while creating spark context

2015-03-24 Thread Sachin Singh
Thanks Sean.
Can you please suggest in which file or configuration I need to set the
proper path? Please elaborate; that would help.

thanks,

Regards
Sachin


On Tue, Mar 24, 2015 at 7:15 PM, Sean Owen so...@cloudera.com wrote:

 That's probably the problem; the intended path is on HDFS but the
 configuration specifies a local path. See the exception message.

 On Tue, Mar 24, 2015 at 1:08 PM, Akhil Das ak...@sigmoidanalytics.com
 wrote:
  It's in your local file system, not in HDFS.
 
  Thanks
  Best Regards
 
  On Tue, Mar 24, 2015 at 6:25 PM, Sachin Singh sachin.sha...@gmail.com
  wrote:
 
  hi,
  I can see the required permission is granted for this directory, as under:
 
   hadoop dfs -ls /user/spark
  DEPRECATED: Use of this script to execute hdfs command is deprecated.
  Instead use the hdfs command for it.
 
  Found 1 items
  drwxrwxrwt   - spark spark  0 2015-03-20 01:04
  /user/spark/applicationHistory
 
  regards
  Sachin
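
The path in question is Spark's event log location, spark.eventLog.dir, which
on CDH is normally set in spark-defaults.conf. A hedged example of pointing it
at HDFS instead of the local file system (host and port are placeholders):

    spark.eventLog.dir  hdfs://namenode.example.com:8020/user/spark/applicationHistory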



Re: issue while creating spark context

2015-03-24 Thread Sachin Singh
Hi Akhil,
thanks for your quick reply.
Could you please elaborate, i.e. what kind of permission is required?

thanks in advance,

Regards
Sachin

On Tue, Mar 24, 2015 at 5:29 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:

 It's an IOException; just make sure you have the correct permissions on the
 */user/spark* directory.

 Thanks
 Best Regards

 On Tue, Mar 24, 2015 at 5:21 PM, sachin Singh sachin.sha...@gmail.com
 wrote:

 Hi all,
 all of a sudden I am getting the below error when submitting a Spark job
 with master yarn; it is not able to create the Spark context, though it was
 previously working fine.
 I am using CDH 5.3.1 and creating a JavaHiveContext.
 spark-submit --jars

 ./analiticlibs/mysql-connector-java-5.1.17.jar,./analiticlibs/log4j-1.2.17.jar
 --master yarn --class myproject.com.java.jobs.Aggregationtask
 sparkjob-1.0.jar

 error message:
 java.io.IOException: Error in creating log directory:
 file:/user/spark/applicationHistory/application_1427194309307_0005
     at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:133)
     at org.apache.spark.util.FileLogger.start(FileLogger.scala:115)
     at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:74)
     at org.apache.spark.SparkContext.<init>(SparkContext.scala:353)
     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
     at myproject.com.java.core.SparkAnaliticEngine.getJavaSparkContext(SparkAnaliticEngine.java:77)
     at myproject.com.java.core.SparkAnaliticTable.evmyprojectate(SparkAnaliticTable.java:108)
     at myproject.com.java.core.SparkAnaliticEngine.evmyprojectateAnaliticTable(SparkAnaliticEngine.java:55)
     at myproject.com.java.core.SparkAnaliticEngine.evmyprojectateAnaliticTable(SparkAnaliticEngine.java:65)
     at myproject.com.java.jobs.CustomAggregationJob.main(CustomAggregationJob.java:184)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)








Re: issue while creating spark context

2015-03-24 Thread Sachin Singh
Hi,
I can see the required permission is granted for this directory, as under:

 hadoop dfs -ls /user/spark
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Found 1 items
*drwxrwxrwt   - spark spark  0 2015-03-20 01:04
/user/spark/applicationHistory*

regards
Sachin


On Tue, Mar 24, 2015 at 6:13 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:

 Write permission, as it is clearly saying:

 java.io.IOException: *Error in creating log directory:*
 file:*/user/spark/*applicationHistory/application_1427194309307_0005

 Thanks
 Best Regards

 On Tue, Mar 24, 2015 at 6:08 PM, Sachin Singh sachin.sha...@gmail.com
 wrote:

 Hi Akhil,
 thanks for your quick reply.
 Could you please elaborate, i.e. what kind of permission is required?

 thanks in advance,

 Regards
 Sachin

 On Tue, Mar 24, 2015 at 5:29 PM, Akhil Das ak...@sigmoidanalytics.com
 wrote:

 It's an IOException; just make sure you have the correct permissions on the
 */user/spark* directory.

 Thanks
 Best Regards

 On Tue, Mar 24, 2015 at 5:21 PM, sachin Singh sachin.sha...@gmail.com
 wrote:

 Hi all,
 all of a sudden I am getting the below error when submitting a Spark job
 with master yarn; it is not able to create the Spark context, though it was
 previously working fine.
 I am using CDH 5.3.1 and creating a JavaHiveContext.
 spark-submit --jars

 ./analiticlibs/mysql-connector-java-5.1.17.jar,./analiticlibs/log4j-1.2.17.jar
 --master yarn --class myproject.com.java.jobs.Aggregationtask
 sparkjob-1.0.jar

 error message:
 java.io.IOException: Error in creating log directory:
 file:/user/spark/applicationHistory/application_1427194309307_0005
     at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:133)
     at org.apache.spark.util.FileLogger.start(FileLogger.scala:115)
     at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:74)
     at org.apache.spark.SparkContext.<init>(SparkContext.scala:353)
     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
     at myproject.com.java.core.SparkAnaliticEngine.getJavaSparkContext(SparkAnaliticEngine.java:77)
     at myproject.com.java.core.SparkAnaliticTable.evmyprojectate(SparkAnaliticTable.java:108)
     at myproject.com.java.core.SparkAnaliticEngine.evmyprojectateAnaliticTable(SparkAnaliticEngine.java:55)
     at myproject.com.java.core.SparkAnaliticEngine.evmyprojectateAnaliticTable(SparkAnaliticEngine.java:65)
     at myproject.com.java.jobs.CustomAggregationJob.main(CustomAggregationJob.java:184)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)










issue while creating spark context

2015-03-24 Thread sachin Singh
Hi all,
all of a sudden I am getting the below error when submitting a Spark job with
master yarn; it is not able to create the Spark context, though it was
previously working fine.
I am using CDH 5.3.1 and creating a JavaHiveContext.
spark-submit --jars
./analiticlibs/mysql-connector-java-5.1.17.jar,./analiticlibs/log4j-1.2.17.jar 
--master yarn --class myproject.com.java.jobs.Aggregationtask
sparkjob-1.0.jar

error message:
java.io.IOException: Error in creating log directory:
file:/user/spark/applicationHistory/application_1427194309307_0005
    at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:133)
    at org.apache.spark.util.FileLogger.start(FileLogger.scala:115)
    at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:74)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:353)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at myproject.com.java.core.SparkAnaliticEngine.getJavaSparkContext(SparkAnaliticEngine.java:77)
    at myproject.com.java.core.SparkAnaliticTable.evmyprojectate(SparkAnaliticTable.java:108)
    at myproject.com.java.core.SparkAnaliticEngine.evmyprojectateAnaliticTable(SparkAnaliticEngine.java:55)
    at myproject.com.java.core.SparkAnaliticEngine.evmyprojectateAnaliticTable(SparkAnaliticEngine.java:65)
    at myproject.com.java.jobs.CustomAggregationJob.main(CustomAggregationJob.java:184)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)






Re: issue while creating spark context

2015-03-24 Thread Sachin Singh
Thanks Sean and Akhil.
I changed the permission of */user/spark/applicationHistory*, and now it
works.


On Tue, Mar 24, 2015 at 7:35 PM, Sachin Singh sachin.sha...@gmail.com
wrote:

 Thanks Sean.
 Can you please suggest in which file or configuration I need to set the
 proper path? Please elaborate; that would help.

 thanks,

 Regards
 Sachin


 On Tue, Mar 24, 2015 at 7:15 PM, Sean Owen so...@cloudera.com wrote:

 That's probably the problem; the intended path is on HDFS but the
 configuration specifies a local path. See the exception message.

 On Tue, Mar 24, 2015 at 1:08 PM, Akhil Das ak...@sigmoidanalytics.com
 wrote:
  It's in your local file system, not in HDFS.
 
  Thanks
  Best Regards
 
  On Tue, Mar 24, 2015 at 6:25 PM, Sachin Singh sachin.sha...@gmail.com
  wrote:
 
  hi,
  I can see the required permission is granted for this directory, as under:
 
   hadoop dfs -ls /user/spark
  DEPRECATED: Use of this script to execute hdfs command is deprecated.
  Instead use the hdfs command for it.
 
  Found 1 items
  drwxrwxrwt   - spark spark  0 2015-03-20 01:04
  /user/spark/applicationHistory
 
  regards
  Sachin





issue creating spark context with CDH 5.3.1

2015-03-09 Thread sachin Singh
Hi,
I am using CDH 5.3.1.
I am getting the below error; even the Spark context is not getting created.
I am submitting my job like this.
Submitting command:
 spark-submit --jars
./analiticlibs/utils-common-1.0.0.jar,./analiticlibs/mysql-connector-java-5.1.17.jar,./analiticlibs/log4j-1.2.17.jar,./analiticlibs/ant-launcher-1.9.1.jar,./analiticlibs/antlr-2.7.7.jar,./analiticlibs/antlr-runtime-3.4.jar,./analiticlibs/avro-1.7.6-cdh5.3.1.jar,./analiticlibs/datanucleus-api-jdo-3.2.6.jar,./analiticlibs/datanucleus-core-3.2.10.jar,./analiticlibs/datanucleus-rdbms-3.2.9.jar,./analiticlibs/derby-10.10.1.1.jar,./analiticlibs/hive-ant-0.13.1-cdh5.3.1.jar,./analiticlibs/hive-contrib-0.13.1-cdh5.3.1.jar,./analiticlibs/hive-exec-0.13.1-cdh5.3.1.jar,./analiticlibs/hive-jdbc-0.13.1-cdh5.3.1.jar,./analiticlibs/hive-metastore-0.13.1-cdh5.3.1.jar,./analiticlibs/hive-service-0.13.1-cdh5.3.1.jar,./analiticlibs/libfb303-0.9.0.jar,./analiticlibs/libthrift-0.9.0-cdh5-2.jar,./analiticlibs/tachyon-0.5.0.jar,./analiticlibs/zookeeper.jar
 
--master yarn --class mycom.java.analitics.SparkEngineTest
sparkanalitics-1.0.0.jar

Even if I do not specify the jars explicitly, I get the same exception.

exception:

Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/hadoop/hive/conf/HiveConf
    at org.apache.spark.sql.hive.api.java.JavaHiveContext.<init>(JavaHiveContext.scala:30)
    at mycom.java.analitics.core.SparkAnaliticEngine.getJavaHiveContext(SparkAnaliticEngine.java:103)
    at mycom.java.analitics.core.SparkAnaliticTable.evmycomate(SparkAnaliticTable.java:106)
    at mycom.java.analitics.core.SparkAnaliticEngine.evmycomateAnaliticTable(SparkAnaliticEngine.java:55)
    at mycom.java.analitics.core.SparkAnaliticEngine.evmycomateAnaliticTable(SparkAnaliticEngine.java:65)
    at mycom.java.analitics.SparkEngineTest.main(SparkEngineTest.java:29)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.hive.conf.HiveConf
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 13 more
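
A NoClassDefFoundError for HiveConf at JavaHiveContext construction means the
Hive classes are missing from the driver's classpath when the context starts;
--jars alone may not cover the driver's own startup classpath. A hedged
variant to try (--driver-class-path is a standard spark-submit flag; the jar
paths below are just the ones already listed in the command above):

    spark-submit --driver-class-path ./analiticlibs/hive-exec-0.13.1-cdh5.3.1.jar:./analiticlibs/hive-metastore-0.13.1-cdh5.3.1.jar \
      ... (remaining options as in the original command)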







Re: issue creating spark context with CDH 5.3.1

2015-03-09 Thread sachin Singh
I have copied hive-site.xml to the Spark conf folder:
cp /etc/hive/conf/hive-site.xml /usr/lib/spark/conf






Re: issue Running Spark Job on Yarn Cluster

2015-03-04 Thread sachin Singh
Not yet.
Please let me know if you find a solution.

Regards
Sachin
On 4 Mar 2015 21:45, mael2210 [via Apache Spark User List] 
ml-node+s1001560n21909...@n3.nabble.com wrote:

 Hello,

 I am facing the exact same issue. Could you solve the problem?

 Kind regards







Re: issue Running Spark Job on Yarn Cluster

2015-02-19 Thread Sachin Singh
Yes.
On 19 Feb 2015 23:40, Harshvardhan Chauhan ha...@gumgum.com wrote:

 Is this the full stack trace?

 On Wed, Feb 18, 2015 at 2:39 AM, sachin Singh sachin.sha...@gmail.com
 wrote:

 Hi,
 I want to run my Spark job in Hadoop YARN cluster mode.
 I am using the below command:
 spark-submit --master yarn-cluster --driver-memory 1g --executor-memory 1g
 --executor-cores 1 --class com.dc.analysis.jobs.AggregationJob
 sparkanalitic.jar param1 param2 param3
 I am getting the error as under; kindly suggest what is going wrong. Is the
 command proper or not? Thanks in advance.

 Exception in thread "main" org.apache.spark.SparkException: Application
 finished with failed status
     at org.apache.spark.deploy.yarn.ClientBase$class.run(ClientBase.scala:509)
     at org.apache.spark.deploy.yarn.Client.run(Client.scala:35)
     at org.apache.spark.deploy.yarn.Client$.main(Client.scala:139)
     at org.apache.spark.deploy.yarn.Client.main(Client.scala)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)








 --
 *Harshvardhan Chauhan*  |  Software Engineer
 *GumGum* http://www.gumgum.com/  |  *Ads that stick*
 310-260-9666  |  ha...@gumgum.com



issue Running Spark Job on Yarn Cluster

2015-02-18 Thread sachin Singh
Hi,
I want to run my Spark job in Hadoop YARN cluster mode.
I am using the below command:
spark-submit --master yarn-cluster --driver-memory 1g --executor-memory 1g
--executor-cores 1 --class com.dc.analysis.jobs.AggregationJob
sparkanalitic.jar param1 param2 param3
I am getting the error as under; kindly suggest what is going wrong. Is the
command proper or not? Thanks in advance.

Exception in thread "main" org.apache.spark.SparkException: Application
finished with failed status
    at org.apache.spark.deploy.yarn.ClientBase$class.run(ClientBase.scala:509)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:35)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:139)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)







how to get SchemaRDD SQL exceptions i.e. table not found exception

2015-02-13 Thread sachin Singh
Hi,
can someone guide me on how to trap the SQL exception for a query executed
using a SchemaRDD? I mean, for example, when the table is not found.

Thanks in advance,
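
A hedged sketch (Spark SQL 1.x; the table name is hypothetical): an
unresolved table surfaces as a RuntimeException whose message starts with
"Table Not Found", and since analysis may only run when the query is first
executed, it is safest to wrap both the sql() call and the first action:

    try {
        JavaSchemaRDD result = sqlContext.sql("SELECT * FROM someTable");
        result.collect(); // forces analysis and execution
    } catch (RuntimeException e) {
        String msg = e.getMessage();
        if (msg != null && msg.contains("Table Not Found")) {
            // handle the missing table here
        } else {
            throw e;
        }
    }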






how to avoid Spark and Hive log from Application log

2015-02-11 Thread sachin Singh
Hi,
can somebody please help me keep Spark and Hive log output out of my
application log? Both Spark and Hive use a log4j property file, and I have
configured my log4j.properties for my application as under, but it still
prints the Spark and Hive console logging too. Please suggest; it is urgent
for me. I am running the application in an HDFS environment.

log4j.rootLogger=DEBUG,debugLog, SplLog

log4j.appender.debugLog=org.apache.log4j.RollingFileAppender
log4j.appender.debugLog.File=logs/Debug.log
log4j.appender.debugLog.MaxFileSize=10MB
log4j.appender.debugLog.MaxBackupIndex=10
log4j.appender.debugLog.layout=org.apache.log4j.PatternLayout
log4j.appender.debugLog.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1} - %m%n
log4j.appender.debugLog.filter.f1=org.apache.log4j.varia.LevelRangeFilter
log4j.appender.debugLog.filter.f1.LevelMax=DEBUG
log4j.appender.debugLog.filter.f1.LevelMin=DEBUG

log4j.appender.SplLog=org.apache.log4j.RollingFileAppender
log4j.appender.SplLog.File=logs/AppSplCmd.log
log4j.appender.SplLog.MaxFileSize=10MB
log4j.appender.SplLog.MaxBackupIndex=10
log4j.appender.SplLog.layout=org.apache.log4j.PatternLayout
log4j.appender.SplLog.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1} - %m%n
log4j.appender.SplLog.filter.f1=org.apache.log4j.varia.LevelRangeFilter
log4j.appender.SplLog.filter.f1.LevelMax=FATAL
log4j.appender.SplLog.filter.f1.LevelMin=INFO

log4j.logger.debugLogger=DEBUG, debugLog
log4j.additivity.debugLogger=false

log4j.logger.AppSplLogger=INFO, SplLog
log4j.additivity.AppSplLogger=false


Thanks in advance,
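
Because log4j.rootLogger=DEBUG applies to every logger that does not override
it, the Spark and Hive framework loggers inherit the file appenders above. A
hedged addition (standard log4j per-package level overrides; the package
names may need adjusting to the exact versions in use) that keeps them out of
the application log:

    # quiet the frameworks without touching application loggers
    log4j.logger.org.apache.spark=WARN
    log4j.logger.org.apache.hadoop=WARN
    log4j.logger.hive=WARN
    log4j.logger.org.apache.hadoop.hive=WARN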










getting error when submit spark with master as yarn

2015-02-07 Thread sachin Singh
Hi,
when I try to execute my program as
spark-submit --master yarn --class com.mytestpack.analysis.SparkTest
sparktest-1.jar

I get the error below:
java.lang.IllegalArgumentException: Required executor memory (1024+384 MB)
is above the max threshold (1024 MB) of this cluster!
    at org.apache.spark.deploy.yarn.ClientBase$class.verifyClusterResources(ClientBase.scala:71)
    at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:35)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:77)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)

I am new to the Hadoop environment.
Please help with how and where I need to set the memory or any other
configuration. Thanks in advance.
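
The message means YARN's maximum container size (1024 MB here) is smaller
than the 1024+384 MB that Spark requests for an executor. One hedged fix (the
property names are standard YARN; the values are only illustrative and must
fit the node's RAM) is raising the limits in yarn-site.xml and restarting
YARN:

    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>4096</value>
    </property>

Alternatively, shrinking the request (for example --executor-memory 512m)
keeps it under the existing threshold.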







how to send JavaDStream RDD using foreachRDD using Java

2015-02-01 Thread sachin Singh
Hi, I want to send streaming data to a Kafka topic.
I have RDD data which I converted into a JavaDStream, and now I want to send
it to a Kafka topic. I don't want the Kafka sending code, just the foreachRDD
implementation. My code looks like this:
public void publishtoKafka(ITblStream t)
{
    final MyTopicProducer MTP =
        ProducerFactory.createProducer(hostname + ":" + port);
    JavaDStream<String> stream = (JavaDStream<String>) t.getRDD();

    stream.foreachRDD(new Function<JavaRDD<String>, Void>() {
        @Override
        public Void call(JavaRDD<String> rdd) throws Exception {
            // bring this batch to the driver and publish each record
            for (String record : rdd.collect()) {
                KafkaUtils.sendDataAsString(MTP, topicName, record);
            }
            return null;
        }
    });
    log.debug("sent to kafka: --");
}

Here MyTopicProducer creates the producer, which works fine, and
KafkaUtils.sendDataAsString is the method that publishes data to the Kafka
topic; it also works fine.

My only problem is converting the JavaDStream RDDs to strings using foreach
or foreachRDD; in the end I need the String messages from the RDDs. Kindly
suggest Java code only; I don't want to use anonymous classes. Please send
only the part that sends the JavaDStream RDDs via foreachRDD using a Function
call.

Thanks in advance,






Spark SQL implementation error

2014-12-30 Thread sachin Singh
I have a table (CSV file) whose data I loaded by creating a POJO matching the
table structure, and created a SchemaRDD as under:

JavaRDD<Test> testSchema =
    sc.textFile("D:/testTable.csv").map(GetTableData); /* GetTableData
    transforms all the table data into Test objects */
JavaSchemaRDD schemaTest = sqlContext.applySchema(testSchema, Test.class);
schemaTest.registerTempTable("testTable");

JavaSchemaRDD sqlQuery = sqlContext.sql("SELECT * FROM testTable");
List<String> totDuration = sqlQuery.map(new Function<Row, String>() {
  public String call(Row row) {
    return "Field1 is: " + row.getInt(0);
  }
}).collect();

This works fine, but if I change the query (the rest of the code is the same)
to:

JavaSchemaRDD sqlQuery =
    sqlContext.sql("SELECT sum(field1) FROM testTable GROUP BY field2");

I get the error:

Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.spark.rdd.ShuffledRDD.<init>(Lorg/apache/spark/rdd/RDD;Lorg/apache/spark/Partitioner;)V

Please help and suggest.






JavaRDD (Data Aggregation) based on key

2014-12-23 Thread sachin Singh
Hi,
I have a CSV file with fields a, b, c.
I want to do an aggregation (sum, average, ...) based on any field (a, b, or
c) as per user input, using the Apache Spark Java API. Please help; this is
urgent!

Thanks in advance,

Regards
Sachin
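
A hedged sketch using the Spark 1.x Java API (the file path, field indices,
and types are illustrative): key each line by one field, then reduce by key
to sum another.

    import scala.Tuple2;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.Function2;
    import org.apache.spark.api.java.function.PairFunction;

    // sum field c grouped by field a, from a CSV of a,b,c
    JavaRDD<String> lines = sc.textFile("data.csv");
    JavaPairRDD<String, Double> byKey = lines.mapToPair(
        new PairFunction<String, String, Double>() {
            public Tuple2<String, Double> call(String line) {
                String[] f = line.split(",");
                return new Tuple2<String, Double>(f[0], Double.parseDouble(f[2]));
            }
        });
    JavaPairRDD<String, Double> sums = byKey.reduceByKey(
        new Function2<Double, Double, Double>() {
            public Double call(Double x, Double y) { return x + y; }
        });

For an average, the same pattern can carry a (sum, count) pair per key and
divide at the end.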


