Re: Giraph (1.1.0-SNAPSHOT and 1.0.0-RC3) unit tests fail
Yes, the failures around Accumulo in the hadoop_2 profile are expected and nothing to worry about. I should probably have mentioned it in my RC announcement email. Sorry about that.

Any failures in the hadoop_1 profile would be a reason to reconsider RC0.

Thanks,
Roman.

P.S. This is one of the reasons we're still running with hadoop_1 as the default profile.

On Mon, Jun 30, 2014 at 3:09 AM, Akila Wajirasena akila.wajiras...@gmail.com wrote:

Hi Roman,

I got the same error when running the hadoop_2 profile. According to this [1], the Accumulo version we use in Giraph (1.4) is not compatible with Hadoop 2. I think this is the issue.

[1] http://apache-accumulo.1065345.n5.nabble.com/Accumulo-Hadoop-version-compatibility-matrix-tp3893p3894.html

Thanks
Akila

On Mon, Jun 30, 2014 at 2:21 PM, Toshio ITO toshio9@toshiba.co.jp wrote:

Hi Roman.

I checked out release-1.1.0-RC0 and built it successfully.

$ git checkout release-1.1.0-RC0
$ mvn clean
$ mvn package -Phadoop_2 -DskipTests
## SUCCESS

However, when I ran the tests with LocalJobRunner, the build failed.

$ mvn clean
$ mvn package -Phadoop_2

The tests from Core and Examples passed, but the build failed at Accumulo I/O:

testAccumuloInputOutput(org.apache.giraph.io.accumulo.TestAccumuloVertexFormat)

The error log contained the following exception:

java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

Next I wanted to run the tests against a running Hadoop 2 instance, but I'm having trouble setting it up (I'm quite new to Hadoop). Could you show me an example configuration (etc/hadoop/* files) for a Hadoop 2.2.0 single-node cluster? That would be very helpful.

On Sun, Jun 29, 2014 at 5:06 PM, Toshio ITO toshio9@toshiba.co.jp wrote:

Hi Roman.

Thanks for the reply. OK, I'll try hadoop_1 and hadoop_2 with the latest release-1.1.0-RC0 and report the result.

That would be extremely helpful! And speaking of which -- I'd like to remind folks that taking RC0 for a spin would really help at this point. If we ever want to have 1.1.0 out, we need the required PMC votes.

Thanks,
Roman.

Toshio Ito
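For anyone puzzled by the IncompatibleClassChangeError quoted above: org.apache.hadoop.mapreduce.JobContext is a class in Hadoop 1 and an interface in Hadoop 2, so bytecode compiled against one cannot link against the other. The sketch below is one hedged way to check which flavour is on a given classpath. The class name JobContextProbe and the method kindOf are made up for illustration and are not part of Giraph, Accumulo, or Hadoop.

```java
/**
 * Hypothetical helper (not part of Giraph or Hadoop): reports whether a
 * named type resolves to a class or an interface on the current classpath.
 */
final class JobContextProbe {

    private JobContextProbe() {
    }

    /** Returns "interface", "class", or "missing" for the given type name. */
    static String kindOf(String className) {
        try {
            // initialize=false: load the type without running its static initializers.
            Class<?> type = Class.forName(
                className, false, JobContextProbe.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or could not be linked.
            return "missing";
        }
    }

    public static void main(String[] args) {
        // Default to the type named in the error message from the thread.
        String name = args.length > 0
            ? args[0]
            : "org.apache.hadoop.mapreduce.JobContext";
        System.out.println(name + " -> " + kindOf(name));
    }
}
```

Run with the same classpath the failing test uses. If it reports "interface" while a dependency was compiled against the Hadoop 1 API, that matches the error in the log. Without Hadoop on the classpath it simply reports "missing".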
Re: Review Request 23140: Fix checkpointing
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/
---

(Updated July 2, 2014, 12:57 a.m.)

Review request for giraph.

Changes
---

I removed aggregator serialization from MasterCompute and workers.

Repository: giraph-git

Description
---

This fix merely makes checkpointing work again.

Diffs (updated)
---

giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373
giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java f0ecca2
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 7d7ceb2
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java ad7e045
giraph-core/src/main/java/org/apache/giraph/master/MasterAggregatorHandler.java 325d91f
giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af
giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e
giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895
giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a
giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62
giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da
giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34
giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44
giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81
giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 09dd46d
giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 8dcf19a
giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 17347db
giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7
giraph-examples/src/test/java/org/apache/giraph/aggregators/TestAggregatorsHandling.java e2b611b
giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java PRE-CREATION

Diff: https://reviews.apache.org/r/23140/diff/

Testing
---

I tested it by running multiple different jobs. I ran PageRank on 2*10^9 vertices on 200 workers and it seems to work just fine. It only takes 2 minutes to save a checkpoint.

Thanks,
Sergey Edunov
[jira] [Updated] (GIRAPH-924) Fix checkpointing
[ https://issues.apache.org/jira/browse/GIRAPH-924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Edunov updated GIRAPH-924:
---
Attachment: GIRAPH-924.patch

Fix checkpointing
---
Key: GIRAPH-924
URL: https://issues.apache.org/jira/browse/GIRAPH-924
Project: Giraph
Issue Type: Improvement
Reporter: Sergey Edunov
Attachments: GIRAPH-924.patch
Original Estimate: 336h
Remaining Estimate: 336h

We need to make checkpointing in Giraph functional again. It currently misses a lot of data because of the many additions we've been making to Giraph (such as information from WorkerContext/MasterCompute, proper integration with per-superstep output, etc.).

--
This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 23140: Fix checkpointing
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/#review47169
---

Thanks, much shorter now. Should we add some tests to make sure things don't get broken again?

giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java
https://reviews.apache.org/r/23140/#comment82778

Why ignore superstep 0? For example, there might be a lot of filtering going on during the input superstep, and it's cheaper to restart from a checkpoint than to read all the data again.

giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java
https://reviews.apache.org/r/23140/#comment82781

Interesting, where do we rely on this?

giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java
https://reviews.apache.org/r/23140/#comment82777

Nice bug ;-)

giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
https://reviews.apache.org/r/23140/#comment82787

This is what output threads are called; please name these differently.

giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java
https://reviews.apache.org/r/23140/#comment82775

We are not using Serializable. What is transient here for?

giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java
https://reviews.apache.org/r/23140/#comment82772

Why did you move this file?

- Maja Kabiljo

On July 2, 2014, 12:57 a.m., Sergey Edunov wrote:

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/
---

(Updated July 2, 2014, 12:57 a.m.)

Review request for giraph.

Repository: giraph-git

Description
---

This fix merely makes checkpointing work again.
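On the review comment about transient in WorkerContext: the keyword only affects java.io serialization. When state is written through explicit write/readFields methods in the style of Hadoop's Writable, a field is saved exactly when the code writes it, whatever its modifiers. The sketch below shows the difference using only the JDK. TransientDemo, State, and the helper methods are hypothetical names for illustration, not Giraph code, and it assumes the checkpointed state goes through write/readFields-style methods as the comment implies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical demo; none of these names come from Giraph. */
final class TransientDemo {

    /** Implements Serializable only so the two mechanisms can be compared. */
    static final class State implements Serializable {
        private static final long serialVersionUID = 1L;
        long total;
        transient long scratch; // skipped by java.io serialization only

        /** Writable-style write: saves both fields, transient or not. */
        void write(DataOutput out) throws IOException {
            out.writeLong(total);
            out.writeLong(scratch);
        }

        /** Writable-style read: restores fields in the order they were written. */
        void readFields(DataInput in) throws IOException {
            total = in.readLong();
            scratch = in.readLong();
        }
    }

    static State newState(long total, long scratch) {
        State state = new State();
        state.total = total;
        state.scratch = scratch;
        return state;
    }

    /** Round trip through the explicit write/readFields methods. */
    static State copyViaWriteAndReadFields(State original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            State copy = new State();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Round trip through java.io object serialization. */
    static State copyViaJavaSerialization(State original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (State) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        State state = newState(42L, 7L);
        System.out.println("write/readFields scratch = "
            + copyViaWriteAndReadFields(state).scratch);
        System.out.println("java.io serialization scratch = "
            + copyViaJavaSerialization(state).scratch);
    }
}
```

With these inputs the write/readFields copy keeps scratch at 7, while the java.io copy resets it to 0. So if nothing in the checkpoint path uses Serializable, the transient modifier is documentation at best.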