> On April 5, 2013, 6:17 p.m., Alessandro Presta wrote: > > My idea was actually to make the base classes (VertexReader etc) > > default-configurable, exactly to avoid the code duplication seen here. They > > will have to be turned from interfaces to abstract base classes (hopefully > > we don't have any code that relies on them being interfaces). What do you > > think? > > Also, are you sure the input formats actually need to be configurable? Only > > the readers/writers need to access special Giraph methods to instantiate > > vertices etc. The input formats generally just need to read some options, > > and that can be done from the context passed in initialize(). > > Same for AggregatorWriter, it doesn't seem to need any of the special > > methods. > > Avery Ching wrote: > Changing the interfaces to abstract classes that are default configurable > for readers/writers is reasonable. > > I think for flexibility it would be good to make the input formats > configurable (i.e. getSplits could depend on some configuration). > > I prefer us to use ImmutableClassesGiraphConfiguration as opposed to > plain Configuration just to make it easier for the user to get > Giraph-specific configuration and it doesn't hurt anything. > > How does that sound? > > I'll start making the changes for the interfaces -> abstract classes > immediately though. > > Alessandro Presta wrote: > What's an example of Giraph-specific configuration needed in an input > format? > If there isn't one, I'm still of the opinion that there's no point in > changing (isn't the whole point of ICGC that it makes instantiating certain > user-defined classes faster?). > Anyway my major point is reducing code duplication by making that > interface -> abstract class change, assuming it is doable. > > Avery Ching wrote: > ICGC doesn't only provide classes, it also provides access to the huge > list of options in GiraphConfiguration. > > Alessandro Presta wrote: > Then we should provide GiraphConfiguration instead (or better yet, merge > the two, which would be less confusing). > Anyway I think Giraph-specific options are meant to be used by Giraph > internals, not by user-defined classes. Maybe I'm wrong on this, but then > please provide a concrete example.
ImmutableClassesGiraphConfiguration is just an optimized GiraphConfiguration for classes. Let's think about merging/splitting functionality in another JIRA. I can't find any examples we have in the code. I can't think of any particular cases, but for instance, you wanted to know where the zk instance was running (say we wanted to get/store some extra information there). If we are going to provide a getConf() to the user, I don't see why it would be advantageous to only Configuration vs ICGC in general. - Avery ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10297/#review18711 ----------------------------------------------------------- On April 5, 2013, 7:24 a.m., Avery Ching wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/10297/ > ----------------------------------------------------------- > > (Updated April 5, 2013, 7:24 a.m.) > > > Review request for giraph. > > > Description > ------- > > All input/output formats and readers/writers now are > ImmutableClassesGiraphConfigurable, which means that we no longer need to > create your own ImmutableClassesGiraphConfiguration per format/reader/writer, > which is a bad way of doing things. This simplifies the code and makes it a > little easier for users to write their own formats/readers/writers. > > > This addresses bug GIRAPH-564. > https://issues.apache.org/jira/browse/GIRAPH-564 > > > Diffs > ----- > > > giraph-accumulo/src/main/java/org/apache/giraph/io/accumulo/AccumuloVertexInputFormat.java > 28dc951a973630bf259f4a937d9c15d2cac2597a > > giraph-accumulo/src/main/java/org/apache/giraph/io/accumulo/AccumuloVertexOutputFormat.java > 182afad8bf7c7dab4b1bf698b5965b1615e44bcb > > giraph-accumulo/src/test/java/org/apache/giraph/io/accumulo/edgemarker/AccumuloEdgeInputFormat.java > 4e76b747112900e95c716a1484df065b2257e7bd > > giraph-accumulo/src/test/java/org/apache/giraph/io/accumulo/edgemarker/AccumuloEdgeOutputFormat.java > 0937b8427dabe089ac800ce8977015fed69fc987 > > giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWriter.java > c1c66787ff7991737f4cf9b5f08e23ae8010ba85 > > giraph-core/src/main/java/org/apache/giraph/aggregators/TextAggregatorWriter.java > ef4371404f264e7d11ea7e1041c8ed55d550b42c > giraph-core/src/main/java/org/apache/giraph/io/BasicVertexValueReader.java > 1aed963f56ea00d4070f878f346181956c66fef6 > giraph-core/src/main/java/org/apache/giraph/io/EdgeInputFormat.java > 87ea5a0eeee5dd994857e8b181afeb8e9b066417 > giraph-core/src/main/java/org/apache/giraph/io/EdgeReader.java > f6dccdc5f63746fbe6f7d96706f8e311edc74808 > giraph-core/src/main/java/org/apache/giraph/io/GiraphInputFormat.java > 6b175a285553b1451320f8f89fe78e48e167988b > giraph-core/src/main/java/org/apache/giraph/io/ReverseEdgeDuplicator.java > d6cb88612a3fde4043839da09f0b7d27809324bd > giraph-core/src/main/java/org/apache/giraph/io/VertexInputFormat.java > 5aface5dce9f4c7d6979acee58a9585a89656bba > giraph-core/src/main/java/org/apache/giraph/io/VertexOutputFormat.java > 270d0402373c8385db2d634d090276805689cfa3 > giraph-core/src/main/java/org/apache/giraph/io/VertexReader.java > ef4ded6a64f3e3337b9b50da325a14fd36f2b4c3 > giraph-core/src/main/java/org/apache/giraph/io/VertexValueReader.java > 25e609303234b363d08de0c4e62b5d4e620fab14 > giraph-core/src/main/java/org/apache/giraph/io/VertexWriter.java > 2aa3a71e416c27fb8bec1536cd303235526c86d8 > > giraph-core/src/main/java/org/apache/giraph/io/formats/AdjacencyListTextVertexInputFormat.java > 7af4a71b34777d87bf5efd3bc0f1fa0278450fb4 > > giraph-core/src/main/java/org/apache/giraph/io/formats/AdjacencyListTextVertexOutputFormat.java > 581540315c69ee93dc212d378004739b3269045a > > giraph-core/src/main/java/org/apache/giraph/io/formats/IdWithValueTextOutputFormat.java > fe4a1d59c56da48bce8c95e1f693e1994a549a2e > > giraph-core/src/main/java/org/apache/giraph/io/formats/IntNullReverseTextEdgeInputFormat.java > 02703480d93ce7dc2dd3c35e1ff9140ba43da9cf > > giraph-core/src/main/java/org/apache/giraph/io/formats/PseudoRandomEdgeInputFormat.java > cd454e3c36235ac07b43e9700e511915bd8e4a47 > > giraph-core/src/main/java/org/apache/giraph/io/formats/PseudoRandomVertexInputFormat.java > c261f728b434c5cd7619e5276bda9ecd976f9de4 > > giraph-core/src/main/java/org/apache/giraph/io/formats/SequenceFileVertexInputFormat.java > 3ffda1fd5f80421fd627b43f3b1e107334f7137d > > giraph-core/src/main/java/org/apache/giraph/io/formats/SequenceFileVertexOutputFormat.java > 468e6bdebaad5d0ddadcf5d8abd72135b439b188 > > giraph-core/src/main/java/org/apache/giraph/io/formats/TextEdgeInputFormat.java > 0aae894a69fd18d390d8f3a4fec5615232c38ef7 > > giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexInputFormat.java > e47bda37d68481e8a0abe02f984df13758f1ca41 > > giraph-core/src/main/java/org/apache/giraph/io/formats/TextVertexOutputFormat.java > ad96cfe07d5ace11fce46f2af46745d09d1631db > > giraph-core/src/main/java/org/apache/giraph/io/iterables/EdgeReaderWrapper.java > 4af221a22bdf98b079c403dcb4bd8f36b0ddc0f1 > > giraph-core/src/main/java/org/apache/giraph/io/iterables/VertexReaderWrapper.java > 7e695ac99873dffc9c63edb8091870f5ceaa748f > > giraph-core/src/main/java/org/apache/giraph/io/superstep_output/MultiThreadedSuperstepOutput.java > a09f915c294c82de73234db1699cf6480cde8530 > > giraph-core/src/main/java/org/apache/giraph/io/superstep_output/SynchronizedSuperstepOutput.java > 9617b24907f7f53b7f16819ab997116369c290f3 > > giraph-core/src/main/java/org/apache/giraph/utils/InMemoryVertexInputFormat.java > 8e56eac4b49caddb635966f8204cab9a5ae88d15 > giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java > 2ea91b5d0f2d3d97d63a3ad96e04df5fa26d849b > > giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java > 017257ad770f97c8541a80fb132190c39cb08138 > > giraph-core/src/main/java/org/apache/giraph/worker/VertexInputSplitsCallable.java > c18648ba0dfc19b6a33f311578de99b5252f36fd > > giraph-core/src/test/java/org/apache/giraph/io/TestAdjacencyListTextVertexOutputFormat.java > 60bfac1036abd9b8d29c965585ab3ace531dbf49 > > giraph-core/src/test/java/org/apache/giraph/io/TestIdWithValueTextOutputFormat.java > 3fb4bbec98a243d9fbf9bb6231b9c7010d63ef48 > > giraph-core/src/test/java/org/apache/giraph/io/TestLongDoubleDoubleAdjacencyListVertexInputFormat.java > 34140886afef90b21c1eb6d6236a72bed967b849 > > giraph-core/src/test/java/org/apache/giraph/io/TestTextDoubleDoubleAdjacencyListVertexInputFormat.java > 7aea9ce24a9bd3bdc3a4b34259761ccd82dfb989 > > giraph-examples/src/main/java/org/apache/giraph/examples/GeneratedVertexReader.java > 9e4cb83659ec4382a9c45179ef8e4e1c73acb49a > > giraph-examples/src/main/java/org/apache/giraph/examples/SimpleAggregatorWriter.java > 188762105c15bfd54cfb34c40dad021ab0fbcc84 > > giraph-examples/src/main/java/org/apache/giraph/examples/SimplePageRankVertex.java > 73dc9a971be53e2695916f05e3f9d0175014fab3 > > giraph-examples/src/main/java/org/apache/giraph/examples/SimpleSuperstepVertex.java > df0359a61c5f69f51cea28230ba89e311ec2df0b > > giraph-hbase/src/main/java/org/apache/giraph/io/hbase/HBaseVertexInputFormat.java > 590ecb03315d7389a7cd87b34d704b8017b2b83f > > giraph-hbase/src/main/java/org/apache/giraph/io/hbase/HBaseVertexOutputFormat.java > 297139708e7006371a9e4a4d3accb8e58a14224d > > giraph-hbase/src/test/java/org/apache/giraph/io/hbase/TestHBaseRootMarkerVertextFormat.java > 4bd5dcfba2d401c796b268485a089aaade2320ca > > giraph-hbase/src/test/java/org/apache/giraph/io/hbase/edgemarker/TableEdgeInputFormat.java > 55e506b3794cf90748581251dd7f453baa1e853d > > giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HCatalogEdgeInputFormat.java > 96ec77f6831cda0419cc1e2a3ab82234843abefb > > giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HCatalogVertexInputFormat.java > 9dea18eaa3210f6a24bef5592acfcff289e975d8 > > giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java > 041e331c79f19bba5c44234256a4192edd791353 > > giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java > 9093603dba77c567300e82f847b762f573256fdf > > giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexInputFormat.java > 1f87257cba0fc0c5dd80f3f204da5eff8fb948f2 > > giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexReader.java > 318e83fd19d8c9fc79234e948938f0ce58966e2e > > giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexOutputFormat.java > 9ff74d633151b1a53a7cd16635032351e16f477a > > giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexWriter.java > 45d0226ddf8828c9587e6904009ffee6c045ad13 > > Diff: https://reviews.apache.org/r/10297/diff/ > > > Testing > ------- > > mvn clean verify > Ran pagerank benchmark on a real cluster. > > > Thanks, > > Avery Ching > >
