Hi All,
We have set up a 2-node cluster (NODE-DSRV05 and NODE-DSRV02). Each node
has 32 GB of RAM, 1 TB of hard disk capacity, and 8 CPU cores. We have set
up HDFS with 2 TB of capacity and a block size of 256 MB. When we try
to process a 1 GB file on Spark, we see the following exception
It shows a NullPointerException; could your data be corrupted? Try putting a
try/catch inside the operation that you are running. Are you running the
worker process on the master node as well? If not, then only 1 node will be
doing the processing. If yes, then try setting the level of parallelism and
Hi Qiuzhuang - you have two options:
1) Within the map step define a validation function that will be executed
on every record.
2) Use the filter function to create a filtered dataset prior to
processing.
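
As a rough illustration of the two options above, here is a minimal sketch
in plain Python. The "key,value" record format and the names parse_record
and is_valid are assumptions made up for this example; they are not from
the original job. With PySpark, the same functions could be passed to
rdd.map(...) and rdd.filter(...).

```python
from typing import Optional, Tuple


def parse_record(line) -> Optional[Tuple[str, int]]:
    """Option 1: validate inside the map step; return None for a bad record."""
    try:
        # Hypothetical format: exactly two comma-separated fields, "key,value".
        key, value = line.split(",")
        return key.strip(), int(value)
    except (ValueError, AttributeError):
        # ValueError: wrong field count or non-integer value.
        # AttributeError: the record is None (or not a string at all).
        return None


def is_valid(line) -> bool:
    """Option 2: predicate for a filter step that runs before processing."""
    return parse_record(line) is not None


lines = ["a,1", "bad line", None, "b,2"]

# Option 1: map every record through the validator, then drop the failures.
option1 = [r for r in map(parse_record, lines) if r is not None]

# Option 2: filter first, so the processing step only ever sees clean input.
option2 = [parse_record(l) for l in filter(is_valid, lines)]
```

Option 1 touches each record once; option 2 keeps the processing code free
of error handling, at the cost of parsing valid records twice in this sketch.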
On 11/14/14, 10:28 AM, Qiuzhuang Lian qiuzhuang.l...@gmail.com wrote:
Hi,
MapReduce has
I noticed Spark 1.2.0-SNAPSHOT still has 2.4.x in the pom. Since 2.5.x is
the current stable Hadoop 2.x, would it make sense for us to update the
poms?
I don't think it's necessary. You're looking at the hadoop-2.4
profile, which works with anything >= 2.4. AFAIK there is no further
specialization needed beyond that. The profile sets hadoop.version to
2.4.0 by default, but this can be overridden.
On Fri, Nov 14, 2014 at 3:43 PM, Corey Nolet
In the past, I've built it by providing -Dhadoop.version=2.5.1 exactly like
you've mentioned. What prompted me to write this email was that I did not
see any documentation that told me Hadoop 2.5.1 was officially supported by
Spark (i.e. community has been using it, any bugs are being fixed,
You're the second person to request this today. Planning to include this in my
PR for SPARK-4338.
-Sandy
On Nov 14, 2014, at 8:48 AM, Corey Nolet cjno...@gmail.com wrote:
In the past, I've built it by providing -Dhadoop.version=2.5.1 exactly like
you've mentioned. What prompted me to write
Yeah I think someone even just suggested that today in a separate
thread? couldn't hurt to just add an example.
On Fri, Nov 14, 2014 at 4:48 PM, Corey Nolet cjno...@gmail.com wrote:
In the past, I've built it by providing -Dhadoop.version=2.5.1 exactly like
you've mentioned. What prompted me to
Hi all, since the vote ends on a Sunday, please let me know if you would
like to extend the deadline to allow more time for testing.
2014-11-13 12:10 GMT-08:00 Sean Owen so...@cloudera.com:
Ah right. This is because I'm running Java 8. This was fixed in
SPARK-3329 (
+1
Tested on Mac OS X, and verified that sort-based shuffle bug is fixed.
Matei
On Nov 14, 2014, at 10:45 AM, Andrew Or and...@databricks.com wrote:
Hi all, since the vote ends on a Sunday, please let me know if you would
like to extend the deadline to allow more time for testing.
A recent patch broke clean builds for me, I am trying to see how
widespread this issue is and whether we need to revert the patch.
The error I've seen is this when building the examples project:
spark-examples_2.10: Could not resolve dependencies for project
A workaround for this issue is identified here:
http://dbknickerbocker.blogspot.com/2013/04/simple-fix-to-missing-toolsjar-in-jdk.html
However, if this affects more users I'd prefer to just fix it properly
in our build.
On Fri, Nov 14, 2014 at 12:17 PM, Patrick Wendell pwend...@gmail.com wrote:
Seems like a comment on that page mentions a fix, which would add yet another
profile though — specifically telling mvn that if it is an apple jdk, use the
classes.jar as the tools.jar as well, since Apple-packaged JDK 6 bundled them
together.
Link:
I think in this case we can probably just drop that dependency, so
there is a simpler fix. But mostly I'm curious whether anyone else has
observed this.
On Fri, Nov 14, 2014 at 12:24 PM, Hari Shreedharan
hshreedha...@cloudera.com wrote:
Seems like a comment on that page mentions a fix, which
+0
I expect to start testing on Monday but won't have enough results to change
my vote from +0
until Monday night or Tuesday morning.
Thanks,
Zach
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-1-1-RC1-tp9311p9370.html
+1
Tested HiveThriftServer2 against Hive 0.12.0 on Mac OS X. Known issues
are fixed. Hive version inspection works as expected.
On 11/15/14 8:25 AM, Zach Fry wrote:
+0
I expect to start testing on Monday but won't have enough results to change
my vote from +0
until Monday night or Tuesday