Re: How does wholeTextFiles() work in Spark-Hadoop Cluster?

2016-09-21 Thread Nisha Menon
Well, I have already tried that. You are talking about a command similar to this, right? yarn logs -applicationId application_Number. This gives me the processing logs, which contain information about the tasks, RDD blocks, etc. What I really need is the output log that gets generated as part of ...
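For reference, a minimal sketch of pulling the aggregated logs for one run and locating executor output in them; the application id is a placeholder, and the LogType section markers reflect the usual Hadoop 2.x aggregated-log layout, which may vary by version:

    # Assumes YARN log aggregation is enabled and the application has finished.
    yarn logs -applicationId application_1474443882011_0001 > all_containers.log
    # Output the job writes to stdout on the executors appears under each
    # container's "LogType:stdout" section.
    grep -n "LogType" all_containers.log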

Re: How does wholeTextFiles() work in Spark-Hadoop Cluster?

2016-09-21 Thread ayan guha
On YARN, logs are aggregated from each container to HDFS. You can use the YARN CLI or UI to view them. For Spark, you would have a history server which consolidates the logs. On 21 Sep 2016 19:03, "Nisha Menon" wrote: > I looked at the driver logs; that reminded me that I needed ...
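A hedged sketch of the configuration on both sides; the property names are the standard YARN and Spark ones, and the HDFS paths are placeholders:

    <!-- yarn-site.xml: aggregate container logs to HDFS -->
    <property>
      <name>yarn.log-aggregation-enable</name>
      <value>true</value>
    </property>

    # spark-defaults.conf: write event logs where the history server reads them
    spark.eventLog.enabled         true
    spark.eventLog.dir             hdfs:///spark-history
    spark.history.fs.logDirectory  hdfs:///spark-history

Note that the history server consolidates Spark event logs (what the web UI shows), while container stdout/stderr goes through YARN log aggregation and is what yarn logs retrieves.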

How does wholeTextFiles() work in Spark-Hadoop Cluster?

2016-09-21 Thread Nisha Menon
I looked at the driver logs; that reminded me that I needed to look at the executor logs. There the issue was that the Spark executors were not getting a configuration file. I broadcast the file and now the processing happens. Thanks for the suggestion. Currently my issue is that the log file ...
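A minimal Java sketch of the fix described above, assuming the configuration file lives on HDFS; the paths and the way the config is used inside the map are hypothetical:

    import java.util.List;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.broadcast.Broadcast;

    public class BroadcastConfigExample {
        public static void main(String[] args) {
            JavaSparkContext sc =
                new JavaSparkContext(new SparkConf().setAppName("broadcast-config"));
            // Read the configuration file once on the driver (placeholder path).
            List<String> configLines =
                sc.textFile("hdfs:///user/cdhuser/app.conf").collect();
            // Broadcast it so every executor gets a copy, instead of each task
            // expecting the file on its local filesystem.
            Broadcast<List<String>> configBc = sc.broadcast(configLines);
            sc.textFile("hdfs:///user/cdhuser/input.txt")
              // The lambda runs on executors and reads the broadcast value there.
              .map(line -> configBc.value().size() + " config lines; record: " + line)
              // Note: this println goes to executor stdout, i.e. the YARN
              // container logs discussed in this thread, not the driver console.
              .foreach(out -> System.out.println(out));
            sc.stop();
        }
    }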

Re: How does wholeTextFiles() work in Spark-Hadoop Cluster?

2016-09-08 Thread Sonal Goyal
Are you looking at the worker logs or the driver's? On Thursday, September 8, 2016, Nisha Menon wrote: > I have an RDD created as follows: > > JavaPairRDD<String, String> inputDataFiles = sparkContext.wholeTextFiles("hdfs://ip:8020/user/cdhuser/inputFolder/"); > ...

How does wholeTextFiles() work in Spark-Hadoop Cluster?

2016-09-08 Thread Nisha Menon
I have an RDD created as follows: JavaPairRDD<String, String> inputDataFiles = sparkContext.wholeTextFiles("hdfs://ip:8020/user/cdhuser/inputFolder/"); On this RDD I perform a map to process individual files and invoke a foreach action to trigger the same map. JavaRDD output = ...
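A self-contained sketch of that pattern, with a placeholder map standing in for the per-file processing that was cut off in the post:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class WholeTextFilesExample {
        public static void main(String[] args) {
            JavaSparkContext sc =
                new JavaSparkContext(new SparkConf().setAppName("whole-text-files"));
            // wholeTextFiles yields one (path, entire file contents) pair per
            // file, so each file is processed as a single record.
            JavaPairRDD<String, String> inputDataFiles =
                sc.wholeTextFiles("hdfs://ip:8020/user/cdhuser/inputFolder/");
            // Placeholder processing in place of the truncated original logic.
            JavaRDD<String> output =
                inputDataFiles.map(file -> file._1() + " -> " + file._2().length() + " bytes");
            // map is lazy; foreach is the action that actually triggers it.
            output.foreach(line -> System.out.println(line));
            sc.stop();
        }
    }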