-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8611/
-----------------------------------------------------------
(Updated Feb. 20, 2013, 7:59 p.m.)
Review request for giraph.
Changes
-------
Added javadocs. Passes mvn install now.
Description
-------
One notable addition is the concept of "profiles", which makes it easy to read
from and write to multiple tables. This should remove a lot of the cruft
around the GiraphHCat* classes.
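To illustrate the profile idea, here is a minimal sketch, not the code in this
diff. It assumes a profile is simply a namespace for one table's settings, so
several tables can coexist in a single job configuration. A plain Map stands in
for a Hadoop Configuration; the class name ProfileSketch, the "hive.io." key
prefix, and the method names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: per-profile namespacing of table settings. */
final class ProfileSketch {
  /** Stand-in for a Hadoop Configuration (assumption: string key/value store). */
  private final Map<String, String> conf = new HashMap<>();

  /** Builds a key such as "hive.io.vertex_input.table" (the prefix is made up). */
  static String key(String profileId, String name) {
    return "hive.io." + profileId + "." + name;
  }

  /** Stores the database and table for one profile without touching others. */
  void setTable(String profileId, String database, String table) {
    conf.put(key(profileId, "database"), database);
    conf.put(key(profileId, "table"), table);
  }

  /** Returns "database.table" for the given profile. */
  String describe(String profileId) {
    return conf.get(key(profileId, "database")) + "."
        + conf.get(key(profileId, "table"));
  }
}
```

With this layout, a job could register one profile for vertex input and another
for edge input, and each reader would look up only the keys under its own
profile ID.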
Note that in the diff I separated out a Hive-only portion that is independent
of Giraph (under package org.apache.hadoop.hive). Nothing under this package
(or its children) touches any Giraph code, so it can be contributed back to
Hive itself as an IOFormat.
Also note the new (I think improved) interface: users no longer need to
implement an XInputFormat. They just create a class that implements the
HiveToVertex (HiveToEdge, VertexToHive) interface, plug that in, and use
HiveVertexInputFormat. This should make user code much cleaner.
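The shape of that user code can be sketched roughly as follows. This is an
illustration only: the interfaces below (RecordSketch, ToVertexSketch) and the
driver class (VertexInputSketch) are hypothetical stand-ins, and their method
names and signatures are assumptions rather than the actual HiveToVertex or
HiveVertexInputFormat API in the diff. The point is that the user writes only
the row-to-vertex mapping, while the framework owns the reading loop.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a row read from a Hive table (hypothetical, not the real API). */
interface RecordSketch {
  Object get(int column);
}

/** Stand-in for the user-facing conversion interface (assumed shape). */
interface ToVertexSketch<I, V> {
  I vertexId(RecordSketch record);
  V vertexValue(RecordSketch record);
}

/** What user code might look like: only the mapping, no InputFormat subclass. */
final class LongDoubleToVertex implements ToVertexSketch<Long, Double> {
  @Override public Long vertexId(RecordSketch record) {
    return (Long) record.get(0);   // column 0 is assumed to hold the vertex ID
  }
  @Override public Double vertexValue(RecordSketch record) {
    return (Double) record.get(1); // column 1 is assumed to hold the value
  }
}

/** Stand-in for the framework side that drives the user's converter. */
final class VertexInputSketch {
  static <I, V> List<String> readAll(List<Object[]> rows,
                                     ToVertexSketch<I, V> converter) {
    List<String> out = new ArrayList<>();
    for (final Object[] row : rows) {
      // Wrap each raw row so the converter sees it through the record interface.
      RecordSketch record = new RecordSketch() {
        @Override public Object get(int column) { return row[column]; }
      };
      out.add(converter.vertexId(record) + "=" + converter.vertexValue(record));
    }
    return out;
  }
}
```

Keeping the conversion in a small interface like this means the same input
format class can be reused across jobs, with only the converter changing.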
This addresses bug GIRAPH-453.
https://issues.apache.org/jira/browse/GIRAPH-453
Diffs (updated)
-----
giraph-accumulo/pom.xml cb9fbc02e6fc8adcb0ec41e0c6aeff75b1ef3f06
giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyClient.java
89ef87fea7a370354156fb7be02ef4249e0a6111
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java
ddeaeb769b548eb1002ccf8c18ffe048eb096f8d
giraph-hbase/pom.xml 7bbbd98c0b3db6878aee4be21eecd821448da7ef
giraph-hcatalog/pom.xml 019f02083012704a997ffe715cefe3adeb153dd9
giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HCatGiraphRunner.java
PRE-CREATION
giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HiveGiraphRunner.java
313bab04c50ed6be7143254de80e36a4ba291516
giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HiveUtils.java
c1f76f1a46d1fc9af489a916256884520c138cb4
giraph-hive/pom.xml PRE-CREATION
giraph-hive/src/main/assembly/compile.xml PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/common/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveToEdge.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveToVertex.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexInputFormat.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexReader.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexOutputFormat.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexWriter.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/output/VertexToHive.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/output/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/giraph/hive/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/HiveReadableRecord.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/HiveRecord.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/HiveTableSchema.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/HiveTableSchemaAware.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/HiveTableSchemas.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/HiveWritableRecord.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/HiveApiRecord.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/HiveApiTableSchema.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/Classes.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/FileSystems.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/HadoopUtils.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/HiveMetastores.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/HiveUtils.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/Inspectors.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/ProgressReporter.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/SerDes.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/Writables.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/common/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/HiveApiInputSplit.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/HiveApiRecordReader.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/InputConf.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/InputInfo.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/InputPartition.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/InputSplitData.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/NoOpInputObserver.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/benchmark/BenchmarkArgs.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/benchmark/CounterRatioGauge.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/benchmark/InputBenchmark.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/benchmark/MetricsObserver.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/benchmark/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/input/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/output/HiveApiOutputCommitter.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/output/HiveApiRecordWriter.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/output/NoOpOutputObserver.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/output/OutputConf.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/output/OutputInfo.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/output/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/impl/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/input/HiveApiInputFormat.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/input/HiveApiInputObserver.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/input/HiveInputDescription.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/input/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/output/HiveApiOutputFormat.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/output/HiveApiOutputObserver.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/output/HiveOutputDescription.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/output/package-info.java
PRE-CREATION
giraph-hive/src/main/java/org/apache/hadoop/hive/api/package-info.java
PRE-CREATION
pom.xml c075762cddd7a698c92aaad4017cd74915160e41
Diff: https://reviews.apache.org/r/8611/diff/
Testing
-------
Ran on some production jobs and verified results were exactly the same.
Here's a comparison of performance on real workloads ("base" is hcatalog,
"mine" is hive):
https://gist.github.com/nitay/880d8fb20d2ac86015d4/raw/6b297fcb287bf8d3dc8175bad217aa86544b4f18/high+school
Basically we see a slight improvement, which is expected because I haven't
done much performance work yet.
A few performance improvement ideas are coming; this is just the first
working version.
Thanks,
Nitay Joffe