> On Oct. 31, 2013, 8:07 p.m., Avery Ching wrote:
> > +1, this is awesome work, Maja. It will fail faster on metastore issues 
> > and also cut back on metastore accesses.  Yay!

Thanks for a quick review, added comments and committing!


- Maja


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15142/#review27948
-----------------------------------------------------------


On Oct. 31, 2013, 6:43 p.m., Maja Kabiljo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15142/
> -----------------------------------------------------------
> 
> (Updated Oct. 31, 2013, 6:43 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Bugs: GIRAPH-789
>     https://issues.apache.org/jira/browse/GIRAPH-789
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> Currently each worker sends multiple requests to the metastore to get info 
> about IO formats, which is unnecessary and can cause issues when the 
> metastore is having problems.
> 
> Hive-io is changed so that it doesn't access the metastore when schema/table 
> info is already present in the Configuration, and HiveGiraphRunner now 
> initializes all the formats to fill in the Configuration. If HiveGiraphRunner 
> is not used, everything will still work, but workers will still access the 
> metastore.
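
The "look in the Configuration first, fall back to the metastore" behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: a plain Map stands in for the Hadoop Configuration, and SchemaSource, KEY_PREFIX and getSchema are hypothetical names that do not come from giraph-hive or Hive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of "check the configuration first, fall back to the metastore". */
class ConfSchemaCache {
  /** Hypothetical stand-in for a metastore client; not a Hive API. */
  interface SchemaSource {
    String fetchSchema(String table);
  }

  /** Hypothetical key prefix; the real patch may use different keys. */
  static final String KEY_PREFIX = "example.hive.schema.";

  /**
   * Returns the schema for a table. Reads the serialized copy from conf when
   * it is present; otherwise calls the metastore and stores the result.
   */
  static String getSchema(Map<String, String> conf, String table,
      SchemaSource metastore) {
    String key = KEY_PREFIX + table;
    String cached = conf.get(key);
    if (cached != null) {
      return cached; // already in the configuration, no metastore access
    }
    String schema = metastore.fetchSchema(table);
    conf.put(key, schema);
    return schema;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Fake metastore that counts how often it is asked for a schema.
    SchemaSource metastore = table -> {
      calls.incrementAndGet();
      return "id:bigint,value:double";
    };

    // Runner side: initialize the format once, which fills the configuration.
    Map<String, String> conf = new HashMap<>();
    getSchema(conf, "vertices", metastore);

    // Worker side: each worker receives its own copy of the configuration.
    for (int worker = 0; worker < 4; worker++) {
      Map<String, String> workerConf = new HashMap<>(conf);
      getSchema(workerConf, "vertices", metastore);
    }
    System.out.println("metastore calls: " + calls.get()); // prints "metastore calls: 1"
  }
}
```

If the runner-side call is skipped, every worker copy misses and asks the metastore itself, which corresponds to the fallback case where HiveGiraphRunner is not used.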
> 
> 
> Diffs
> -----
> 
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java 6b8a8e9 
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveUtils.java b809413 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java 534a773 
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexInputFormat.java d5c1279 
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexOutputFormat.java c4813fb 
>   pom.xml f2981ff 
> 
> Diff: https://reviews.apache.org/r/15142/diff/
> 
> 
> Testing
> -------
> 
> mvn clean verify
> 
> Ran jobs with single and multiple input formats, with added logging for each 
> metastore call in hive-io. For example, with a single vertex input, a single 
> edge input and an output, each worker now makes no metastore calls instead 
> of 8. The number of calls from the master is also reduced: we only get input 
> partition descriptions at the beginning of the job and make no calls at the 
> end (for output). The only call left at the end is from the cleanup task, to 
> register the new partition. The cleanup task used to make two additional 
> calls, which are also removed.
> 
> 
> Thanks,
> 
> Maja Kabiljo
> 
>
