Re: Spark log4j fully qualified class name

2016-02-27 Thread Prabhu Joseph
Ted, that was useful information, but I want to use it in a development cluster to learn the internals through the logs. On using %C, the actual FQCN is not printed; instead org.apache.spark.Logging overwrites it. Is this an intended change in Spark? On Sun, Feb 28, 2016 at 3:46 AM, Ted Yu wrote: >
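The behavior Prabhu describes can be reproduced without Spark or log4j at all: %C-style patterns inspect the call stack at log time, and when every class logs through a shared helper (as Spark classes do via the Logging trait), the helper is the frame they find. A minimal plain-Java sketch of the mechanism (class names here are illustrative stand-ins, not Spark's or log4j's):

```java
public class FqcnDemo {
    /** Stand-in for log4j's PatternLayout %C: report the class that invoked the logger. */
    static class Log4jSim {
        static String format(String msg) {
            // [0] = getStackTrace, [1] = format, [2] = whoever called into the "logger"
            StackTraceElement caller = Thread.currentThread().getStackTrace()[2];
            return caller.getClassName() + ": " + msg;
        }
    }

    /** Stand-in for org.apache.spark.Logging: every class logs through this helper. */
    static class Logging {
        static String logInfo(String msg) {
            return Log4jSim.format(msg); // <- this frame is what a %C-style lookup finds
        }
    }

    public static void main(String[] args) {
        // The real caller is FqcnDemo, but the class recorded is the wrapper:
        System.out.println(Logging.logInfo("hello")); // prints "FqcnDemo$Logging: hello"
    }
}
```

So the FQCN is not "overwritten" so much as accurately reported: the wrapper really is the immediate caller of the logging framework.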

Re: Spark log4j fully qualified class name

2016-02-27 Thread Ted Yu
Looking at https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html: *WARNING* Generating the caller class information is slow. Thus, its use should be avoided unless execution speed is not an issue. On Sat, Feb 27, 2016 at 12:40 PM, Prabhu Joseph wrote: > Hi All, > > Whe

spark yarn exec container fails if yarn.nodemanager.local-dirs value starts with file://

2016-02-27 Thread Alexander Pivovarov
Spark YARN executor containers fail if yarn.nodemanager.local-dirs starts with file://: yarn.nodemanager.local-dirs file:///data01/yarn/nm,file:///data02/yarn/nm Other applications, e.g. Hadoop MR and Hive, work normally. Spark works only if yarn.nodemanager.local-dirs does not hav
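For comparison, a sketch of the configuration form that the message reports as working (paths taken from the thread; whether Spark should also accept the file:// form is the open question here):

```xml
<!-- yarn-site.xml: the form that works with Spark executor containers -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <!-- plain local paths, no file:// scheme -->
  <value>/data01/yarn/nm,/data02/yarn/nm</value>
</property>
```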

Spark log4j fully qualified class name

2016-02-27 Thread Prabhu Joseph
Hi All, When I change the Spark log4j.properties conversion pattern to show the fully qualified class name, all the logs have the FQCN as org.apache.spark.Logging. The actual fully qualified class name is overwritten. log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p *%C
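For reference, the two relevant conversion characters (a sketch showing both alternatives; %C derives the class from the call stack at log time, while %c prints the logger's category name, and since Spark's Logging trait names each logger after the owning class, %c should usually show the real class cheaply):

```properties
# Alternative 1: %C walks the stack (slow per the log4j docs, and with a shared
# wrapper like org.apache.spark.Logging it reports the wrapper's name)
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %C: %m%n

# Alternative 2: %c prints the logger category; Spark's Logging trait names its
# loggers after the owning class, so this shows the real FQCN at no extra cost
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c: %m%n
```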

Re: some joins stopped working with spark 2.0.0 SNAPSHOT

2016-02-27 Thread Jonathan Kelly
If you want to find which commit caused it, try the "git bisect" command. On Sat, Feb 27, 2016 at 11:06 AM Koert Kuipers wrote: > https://issues.apache.org/jira/browse/SPARK-13531 > > On Sat, Feb 27, 2016 at 3:49 AM, Reynold Xin wrote: > >> Can you file a JIRA ticket? >> >> >> On Friday, Febr

Re: some joins stopped working with spark 2.0.0 SNAPSHOT

2016-02-27 Thread Koert Kuipers
https://issues.apache.org/jira/browse/SPARK-13531 On Sat, Feb 27, 2016 at 3:49 AM, Reynold Xin wrote: > Can you file a JIRA ticket? > > > On Friday, February 26, 2016, Koert Kuipers wrote: > >> dataframe df1: >> schema: >> StructType(StructField(x,IntegerType,true)) >> explain: >> == Physical P

Re: More Robust DataSource Parameters

2016-02-27 Thread Hamel Kothari
Thanks for the flags, Reynold. 1. For the 4+ languages, these are just on the consumption side (i.e., you can't write a data source in Python or SQL), correct? If so, and you can only write data sources in JVM languages, then that makes this story a lot easier. On the DataSou

beeline and spark-defaults.conf

2016-02-27 Thread longsonr
I'd like to be able to pass a Java define to beeline (to configure jline). I tried setting spark.driver.extraJavaOptions in my spark-defaults.conf file; however, beeline does not seem to read this file. It picks up settings from the SPARK_JAVA_OPTS environment variable instead. If I changed the
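A sketch of the workaround the message implies (assumptions: this beeline launch script honors SPARK_JAVA_OPTS as described, and the jline property shown is just an illustrative define, not necessarily the one needed):

```shell
# Put the define where this beeline actually looks for it, rather than in
# spark-defaults.conf, which it does not appear to read:
export SPARK_JAVA_OPTS="-Djline.terminal=jline.UnsupportedTerminal"
# "$SPARK_HOME/bin/beeline"   # beeline's JVM then starts with the define above
```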

Spark Checkpointing behavior

2016-02-27 Thread Tarek Elgamal
Hi, I am trying to understand the behavior of rdd.checkpoint() in Spark. I am running the JavaPageRank example on a 1 GB graph and I am checkpointing the *ranks* RDD inside each iterati

Re: Is spark.driver.maxResultSize used correctly ?

2016-02-27 Thread Reynold Xin
But sometimes you might have skew, and almost all the result data end up in one or a few tasks. On Friday, February 26, 2016, Jeff Zhang wrote: > > My job get this exception very easily even when I set large value of > spark.driver.maxResultSize. After checking the spark code, I found > spark
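For completeness, the knob under discussion, as a spark-defaults.conf sketch (as Reynold notes, skew means the per-task result sizes this limit sums can be far from uniform):

```properties
# spark-defaults.conf
# Total size of serialized task results the driver will accept per action.
# Default is 1g; 0 disables the check entirely, at the risk of driver OOM.
spark.driver.maxResultSize  4g
```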

Re: some joins stopped working with spark 2.0.0 SNAPSHOT

2016-02-27 Thread Reynold Xin
Can you file a JIRA ticket? On Friday, February 26, 2016, Koert Kuipers wrote: > dataframe df1: > schema: > StructType(StructField(x,IntegerType,true)) > explain: > == Physical Plan == > MapPartitions , obj#135: object, [if (input[0, > object].isNullAt) null else input[0, object].get AS x#128] >