Hello everyone,

Just to add to this thread: it became plainly obvious to me that improvements/simplifications could be made here when I ran into problems trying to fix the gora-cassandra tests. The HBase store seems to use Configuration and hbase-site.xml, the Cassandra store needs properties and cassandra.yaml, the SQL store needs properties, etc.
I see Ferdy's suggestion is to deprecate the Properties object going forward and have the Gora API prefer Configuration instead. Although my knowledge of Pig etc. is limited at this stage, I can only assume that this also ties in with our future vision of improving the analysis aspect of the Gora API.

Thanks

On Tue, May 1, 2012 at 12:16 PM, Ferdy Galema <[email protected]> wrote:
> Hi,
>
> While Lewis and I were discussing NUTCH-1205, we identified
> the Properties object as the major source of trouble/confusion when
> configuring datastores. First and foremost, it makes no sense that we
> have 2 ways to configure a store, namely via Configuration and Properties.
> Besides this, there seems to be some trouble with the serialization and
> initialization of the stores. (Sometimes runtime Properties settings are
> not correctly used.) We have a few ways to solve this problem:
>
> - Stop supporting the adding of dynamic properties (runtime settings) and
>   only support the static gora.properties file. People wanting to use runtime
>   props shall use Configuration somehow.
> - Completely remove the Properties object from Gora altogether. Migrate
>   existing properties to Configuration.
> - Try to make both Properties and Configuration work (that's what the
>   current direction seems to be). Difficult, it seems.
> - Something else?
>
> I think the second option is best. The advantage of Configuration is that
> it inherently works with mapreduce because it is automatically
> (de)serialized and available in the mappers and reducers. What do you think?
>
> Ferdy.

--
Lewis
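
P.S. To make the second option above a bit more concrete, here is a rough sketch of what "migrate existing properties to Configuration" could look like. Everything in it is an assumption for illustration: the class and method names are hypothetical and not part of Gora, and a plain Map<String, String> stands in for Hadoop's Configuration so the snippet runs without Hadoop on the classpath. I have also assumed that a value already set at runtime should win over the one from gora.properties.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, NOT part of Gora: illustrates folding the entries of a
// gora.properties-style Properties object into a single key/value configuration.
final class PropertiesMigration {

    // Copies every property into conf unless conf already has a value for
    // that key, so runtime settings take precedence (an assumed policy).
    // The Map is a stand-in for Hadoop's Configuration.
    static Map<String, String> toConfiguration(Properties props, Map<String, String> conf) {
        for (String key : props.stringPropertyNames()) {
            conf.putIfAbsent(key, props.getProperty(key));
        }
        return conf;
    }

    public static void main(String[] args) {
        // Static settings, as they might be loaded from gora.properties.
        Properties fileProps = new Properties();
        fileProps.setProperty("gora.datastore.default", "org.apache.gora.hbase.store.HBaseStore");
        fileProps.setProperty("example.setting", "from-file"); // illustrative key

        // Runtime settings, as a job might set them before submission.
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("example.setting", "from-runtime");

        toConfiguration(fileProps, conf);
        conf.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

With a real Configuration the loop body would presumably check conf.get(key) for null and then call conf.set(key, value). The point is only that, once everything lives in one object, there is a single place where precedence is decided.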

