Re: Is there a way to do conditional group by in spark 2.1.1?

2017-06-10 Thread vaquar khan
Avoid groupBy and use reduceByKey. Regards, Vaquar khan On Jun 4, 2017 8:32 AM, "Guy Cohen" wrote: > Try this one: > > df.groupBy( > when(expr("field1='foo'"),"field1").when(expr("field2='bar'"),"field2")) > > > On Sun, Jun 4, 2017 at 3:16 AM, Bryan Jeffrey
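The two suggestions in this thread (a conditional group key via chained `when(...)` clauses, and reduce-style aggregation that combines values per key as they arrive rather than collecting whole groups) can be sketched in plain Python without a Spark cluster. The sample rows and column names are hypothetical, chosen only to mirror the `field1='foo'` / `field2='bar'` conditions from the quoted snippet:

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the DataFrame in the thread.
rows = [
    {"field1": "foo", "field2": "x", "value": 1},
    {"field1": "baz", "field2": "bar", "value": 2},
    {"field1": "foo", "field2": "y", "value": 3},
]

def group_key(row):
    # Mirrors when(expr("field1='foo'"),"field1").when(expr("field2='bar'"),"field2"):
    # return the first matching column name, else None (a chain of when() clauses
    # with no otherwise() yields null for non-matching rows).
    if row["field1"] == "foo":
        return "field1"
    if row["field2"] == "bar":
        return "field2"
    return None

# reduceByKey-style aggregation: fold each value into a running total per key
# as it arrives, instead of materializing all rows per key first (as groupByKey
# would), which is what makes reduceByKey cheaper on a real cluster.
totals = defaultdict(int)
for row in rows:
    totals[group_key(row)] += row["value"]

print(dict(totals))  # {'field1': 4, 'field2': 2}
```

On a real Spark cluster the win of `reduceByKey` is that partial sums are combined on each executor before the shuffle; this sketch only shows the per-key fold semantics.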

Re: Spark Job is stuck at SUBMITTED when set Driver Memory > Executor Memory

2017-06-10 Thread vaquar khan
You can add memory in your command; make sure the requested memory is actually available on your executors: ./bin/spark-submit \ --class org.apache.spark.examples.SparkPi \ --master spark://207.184.161.138:7077 \ --executor-memory 20G \ --total-executor-cores 100 \ /path/to/examples.jar \ 1000
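Since the thread is specifically about setting driver memory larger than executor memory, a sketch of the same invocation with an explicit `--driver-memory` flag may help; the master URL and jar path are placeholders carried over from the example above, and the memory values are illustrative only:

```shell
# Hypothetical spark-submit invocation (master URL, sizes, and jar path are placeholders).
# In standalone mode, a worker must have enough free memory to launch the executor,
# and the machine hosting the driver must satisfy --driver-memory; if either request
# cannot be met, the application can sit in the SUBMITTED/WAITING state indefinitely.
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --driver-memory 4G \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000
```

Checking the Master web UI (port 8080 by default) shows how much memory each worker can offer, which is the number the executor request has to fit within.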

Re: [Spark JDBC] Does spark support read from remote Hive server via JDBC

2017-06-10 Thread vaquar khan
Hi, Please check your firewall security settings. Sharing one good link: http://belablotski.blogspot.in/2016/01/access-hive-tables-from-spark-using.html?m=1 Regards, Vaquar khan On Jun 8, 2017 1:53 AM, "Patrik Medvedev" wrote: > Hello guys, > > Can somebody

Re: Scala, Python or Java for Spark programming

2017-06-10 Thread vaquar khan
It depends on programming style. I would suggest setting up a few rules to avoid complex code in Scala and, if needed, asking programmers to add proper comments. Regards, Vaquar khan On Jun 8, 2017 4:17 AM, "JB Data" wrote: > Java is Object langage borned to Data, Python is Data

Re: Read Data From NFS

2017-06-10 Thread vaquar khan
Hi Ayan, If you have multiple files (for example, 12 files) and you are using the following code, then you will get 12 partitions: r = sc.textFile("file://my/file/*") Not sure what you want to know about the file system; please check the API doc. Regards, Vaquar khan On Jun 8, 2017 10:44 AM, "ayan guha"
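The one-partition-per-file rule for small files can be illustrated without Spark. This plain-Python sketch (file names and contents are hypothetical) creates 12 small files and counts the glob matches, which is the number of partitions `sc.textFile` would produce for them under default settings; note that on a real cluster, files larger than a block/split size are split into multiple partitions, so 12 is a minimum, not always the exact count:

```python
import glob
import os
import tempfile

# Create 12 small text files in a temporary directory.
tmp = tempfile.mkdtemp()
for i in range(12):
    with open(os.path.join(tmp, f"part-{i}.txt"), "w") as f:
        f.write(f"line {i}\n")

# sc.textFile("file://.../*.txt") would match these same files; with small
# files, each match becomes (at least) one partition.
matched = glob.glob(os.path.join(tmp, "*.txt"))
num_partitions = len(matched)
print(num_partitions)  # 12
```

The optional `minPartitions` argument to `textFile` can raise this number, but never lowers it below the number of input splits.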

Re: problem initiating spark context with pyspark

2017-06-10 Thread Felix Cheung
Curtis, assuming you are running a somewhat recent Windows version, you would not have access to C:\tmp (in your command example, winutils.exe ls -F C:\tmp\hive). Try changing the path to somewhere under your user directory. Running Spark on Windows should work :) From:
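For readers hitting the same wall, a commonly reported companion fix is to grant permissions on the Hive scratch directory with winutils itself. The winutils.exe location below is a placeholder (it depends on where your Hadoop binaries live), and these commands are a sketch of the usual workaround, not a guaranteed recipe:

```shell
REM Hypothetical paths: adjust C:\hadoop to wherever winutils.exe is installed.
REM Grant read/write permissions on the Hive scratch directory...
C:\hadoop\bin\winutils.exe chmod -R 777 C:\tmp\hive
REM ...then verify the permissions took effect:
C:\hadoop\bin\winutils.exe ls -F C:\tmp\hive
```

If `C:\tmp` itself is locked down by policy, moving the scratch directory under your user profile (as suggested above) avoids the permission problem entirely.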

Re: problem initiating spark context with pyspark

2017-06-10 Thread Marco Mistroni
Ha... it's a one-off. I run Spark on Ubuntu and Docker on Windows... I don't think Spark and Windows are best friends.  On Jun 10, 2017 6:36 PM, "Gourav Sengupta" wrote: > seeing for the very first time someone try SPARK on Windows :) > > On Thu, Jun 8, 2017 at

Re: problem initiating spark context with pyspark

2017-06-10 Thread Gourav Sengupta
seeing for the very first time someone try SPARK on Windows :) On Thu, Jun 8, 2017 at 8:38 PM, Marco Mistroni wrote: > try this link > > http://letstalkspark.blogspot.co.uk/2016/02/getting-started- > with-spark-on-window-64.html > > it helped me when i had similar problems

[jira] Lantao Jin shared "SPARK-21023: Ignore to load default properties file is not a good choice from the perspective of system" with you

2017-06-10 Thread Lantao Jin (JIRA)
Lantao Jin shared an issue with you Hi all, Do you think it is a bug? Should we keep the current behavior? > Ignore to load default properties file is not a good choice from the > perspective of system >