Re: Giraph (1.1.0-SNAPSHOT and 1.0.0-RC3) unit tests fail

2014-07-01 Thread Roman Shaposhnik
Yes, the failures around Accumulo in hadoop_2 profile are expected and nothing
to worry about. I probably should have mentioned it in my RC announcement email.
Sorry about that.

Any failures in hadoop_1 profile would be a reason to reconsider RC0.

Thanks,
Roman.

P.S. This is one of the reasons we're still running with hadoop_1 as a default
profile.

On Mon, Jun 30, 2014 at 3:09 AM, Akila Wajirasena
akila.wajiras...@gmail.com wrote:
 Hi Roman,

 I got the same error when running hadoop_2 profile.
 According to [1], the Accumulo version we use in Giraph (1.4) is not
 compatible with Hadoop 2.
 I think this is the issue.

 [1]
 http://apache-accumulo.1065345.n5.nabble.com/Accumulo-Hadoop-version-compatibility-matrix-tp3893p3894.html

 Thanks

 Akila


 On Mon, Jun 30, 2014 at 2:21 PM, Toshio ITO toshio9@toshiba.co.jp
 wrote:

 Hi Roman.

 I checked out release-1.1.0-RC0 and built it successfully.

 $ git checkout release-1.1.0-RC0
 $ mvn clean
 $ mvn package -Phadoop_2 -DskipTests
 ## SUCCESS

 However, when I ran the tests with LocalJobRunner, the build failed.

 $ mvn clean
 $ mvn package -Phadoop_2

 The tests from Core and Examples passed, but this test in Accumulo I/O
 failed:

 testAccumuloInputOutput(org.apache.giraph.io.accumulo.TestAccumuloVertexFormat)

 The error log contained the following exception:

 java.lang.IncompatibleClassChangeError: Found interface
 org.apache.hadoop.mapreduce.JobContext, but class was expected
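For context on this error: an IncompatibleClassChangeError of the form "Found interface X, but class was expected" typically means that code compiled against a library version where X was a class is running against a version where X is an interface. JobContext is a class in Hadoop 1.x and an interface in Hadoop 2.x, which fits the explanation that Accumulo 1.4 was built for Hadoop 1. A minimal sketch for checking which kind of type is actually on the test classpath follows; the class name ClassKindCheck and the method describe are hypothetical helpers, not part of Giraph, Accumulo, or Hadoop.

```java
class ClassKindCheck {
    /**
     * Hypothetical helper: reports whether the named type resolves to an
     * interface or a class on the current classpath.
     */
    static String describe(String typeName) {
        try {
            // Load without running static initializers; we only inspect the type.
            Class<?> type = Class.forName(
                typeName, false, ClassKindCheck.class.getClassLoader());
            return typeName + (type.isInterface() ? " is an interface" : " is a class");
        } catch (ClassNotFoundException e) {
            return typeName + " is not on the classpath";
        }
    }

    public static void main(String[] args) {
        // Default to the type named in the exception above.
        String typeName = args.length > 0
            ? args[0]
            : "org.apache.hadoop.mapreduce.JobContext";
        System.out.println(describe(typeName));
    }
}
```

Run with the same classpath as the failing test, this would show which flavor of JobContext the Accumulo code is being linked against.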


 Next I wanted to run the tests against a running Hadoop 2 instance, but
 I'm having trouble setting it up (I'm quite new to Hadoop).

 Could you show me an example configuration (etc/hadoop/* files) for a
 Hadoop 2.2.0 single-node cluster? That would be very helpful.

  On Sun, Jun 29, 2014 at 5:06 PM, Toshio ITO toshio9@toshiba.co.jp
  wrote:
   Hi Roman.
  
   Thanks for the reply.
  
   OK, I'll try hadoop_1 and hadoop_2 with the latest
   release-1.1.0-RC0 and report the result.
 
  That would be extremely helpful!
 
  And speaking of which -- I'd like to remind folks
  that taking RC0 for a spin would really help
  at this point. If we ever want to have 1.1.0 out
  we need the required PMC votes.
 
  Thanks,
  Roman.
 
 Toshio Ito


Re: Review Request 23140: Fix checkpointing

2014-07-01 Thread Sergey Edunov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/
---

(Updated July 2, 2014, 12:57 a.m.)


Review request for giraph.


Changes
---

I removed aggregator serialization from MasterCompute and the workers.


Repository: giraph-git


Description
---

This fix merely makes checkpointing work again. 


Diffs (updated)
-

  
  giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805
  giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373
  giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java f0ecca2
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 7d7ceb2
  giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java ad7e045
  giraph-core/src/main/java/org/apache/giraph/master/MasterAggregatorHandler.java 325d91f
  giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af
  giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e
  giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895
  giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a
  giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62
  giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44
  giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81
  giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 09dd46d
  giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 8dcf19a
  giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 17347db
  giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7
  giraph-examples/src/test/java/org/apache/giraph/aggregators/TestAggregatorsHandling.java e2b611b
  giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java PRE-CREATION

Diff: https://reviews.apache.org/r/23140/diff/


Testing
---

I tested it by running several different jobs. I ran PageRank on 2*10^9 vertices
on 200 workers and it seems to work just fine. It takes only 2 minutes to save
a checkpoint.


Thanks,

Sergey Edunov



[jira] [Updated] (GIRAPH-924) Fix checkpointing

2014-07-01 Thread Sergey Edunov (JIRA)

 [ https://issues.apache.org/jira/browse/GIRAPH-924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Edunov updated GIRAPH-924:
-

Attachment: GIRAPH-924.patch

 Fix checkpointing
 -

                Key: GIRAPH-924
                URL: https://issues.apache.org/jira/browse/GIRAPH-924
            Project: Giraph
         Issue Type: Improvement
           Reporter: Sergey Edunov
        Attachments: GIRAPH-924.patch

  Original Estimate: 336h
 Remaining Estimate: 336h

 We need to make checkpointing in Giraph functional again. It misses a lot of
 data because of the many additions we've made to Giraph (such as information
 from WorkerContext/MasterCompute, proper integration with per-superstep
 output, etc.).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 23140: Fix checkpointing

2014-07-01 Thread Maja Kabiljo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/#review47169
---


Thanks, much shorter now. Should we add some tests to make sure things don't 
get broken again?


giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java
https://reviews.apache.org/r/23140/#comment82778

Why ignore superstep 0? For example, there might be a lot of filtering going
on during the input superstep, and it's cheaper to restart from a checkpoint
than to read all the data again.



giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java
https://reviews.apache.org/r/23140/#comment82781

Interesting, where do we rely on this?



giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java
https://reviews.apache.org/r/23140/#comment82777

Nice bug ;-)



giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
https://reviews.apache.org/r/23140/#comment82787

This is what the output threads are called; please name these differently.



giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java
https://reviews.apache.org/r/23140/#comment82775

We are not using Serializable - what's transient here for?
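To illustrate the point behind this question: the transient modifier only affects Java's built-in serialization (java.io.Serializable). When state is written explicitly through DataOutput, in the style of Hadoop's Writable write/readFields methods, the modifier has no effect on what gets saved. A minimal sketch, assuming a hypothetical Counter class (this is not Giraph's WorkerContext, and the helper names are illustrative only):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    /** Hypothetical stand-in for a context object. */
    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int kept = 1;
        transient int skipped = 2;

        // Writable-style methods: fields are written explicitly, so the
        // transient modifier plays no role here.
        void write(DataOutput out) throws IOException {
            out.writeInt(kept);
            out.writeInt(skipped);
        }

        void readFields(DataInput in) throws IOException {
            kept = in.readInt();
            skipped = in.readInt();
        }
    }

    /** Round trip through java.io serialization; transient fields are dropped. */
    static Counter viaJavaSerialization(Counter c)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(c);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Counter) in.readObject();
        }
    }

    /** Round trip through explicit write/readFields; every written field survives. */
    static Counter viaExplicitWrite(Counter c) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            c.write(out);
        }
        Counter copy = new Counter();
        copy.kept = 0;
        copy.skipped = 0; // reset so the values below come from the stream
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            copy.readFields(in);
        }
        return copy;
    }

    public static void main(String[] args) throws Exception {
        Counter a = viaJavaSerialization(new Counter());
        Counter b = viaExplicitWrite(new Counter());
        System.out.println("java serialization: kept=" + a.kept + " skipped=" + a.skipped);
        System.out.println("explicit write:     kept=" + b.kept + " skipped=" + b.skipped);
    }
}
```

After the java.io round trip the transient field comes back as 0, while the explicit write/readFields round trip restores it as 2. So in a class that is only ever persisted through explicit writes, transient serves as documentation at most.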



giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java
https://reviews.apache.org/r/23140/#comment82772

Why did you move this file?


- Maja Kabiljo

