Hi Renato,

> " . . . Gora should be low invasive: data schema is created and stored
> out of the backend so ideally you could access your data without Gora.
> We will see that this is hard to achieve at some extent (like in
> nested records or several types unions)."
>
> You can access your data directly without using Gora e.g. your Nutch
> data can be queried and retrieved by using HBase clients, Cassandra
> clients, Hadoop, or anything. Is this what you meant?

Not exactly. We may have to make data accessible without Gora. Generally speaking, in 0.2.1 this is possible except for some serialized things (usually records). Once union types are added this gets worse, but we can keep the data accessible by adding a new configuration option. In that case the access would be schema-less.

The other point of view is accessing the full data from outside Gora while having the schema definition. Surely there are cases where this is possible. As I said, this is hard to achieve to some extent, and not all backends would support it.

> About Nested records, oh man, your descriptions are really
> interesting! (: and you are right about the possible approaches, would
> you mind opening a JIRA issue to keep track of this? I mean for
> complex nested data structures. Thanks!

Sorry, I don't understand. Should I open an issue? What I wrote were only descriptions and thoughts :P

> So when you are talking about implementing this on HBase, are you
> still talking about handling null-one-type-unions (GORA-174)? or are
> you talking about the nested features described before?

Both: optional single-type unions and multi-type unions.

> About Cassandra issues, the cloning process you are describing is
> problem that Roland was looking into, let's hope we can work that one
> out soon. The way Gora-Cassandra serializes data is what you've
> described in your first option, and I also think the second one is a
> better option.

I am thinking of something different from you. The way Gora-Cassandra serializes is what is shown in "Implementation details in Cassandra", excluding "Proposed implementations are two:". The way described in "Proposed implementations > First option" is not Gora-Cassandra's approach, but HBase's. The way described in "Proposed implementations > Second option", as you say, seems much better.

> Did you happen to see that email Lewis sent about
> pluggable client architecture for Gora-Cassandra?
> the idea with this would be to create these type of abstractions. Good
> to see we are all in the same page (:

I skimmed it but have not analyzed it, so I don't really have an opinion yet. At the moment I find Gora quite pluggable: you put a .jar with your backend on the classpath and configure gora.properties to use your classes. I guess you are talking about something that goes further :) Will read the abstract ;)

> I will open different JIRA issues to track all these problems
> separately in order to make smaller more digestible patches and start
> committing them and getting 0.3 out!

I think the current arrangement is fine, with one main issue for common code and one per backend, but feel free.

> Just one last question about GORA-174, do you remember that some tests
> were not passing after applying the patch? Well after applying
> GORA-174 + GORA-182 + GORA-206, there were not any under
> gora-xxx/target/surefire-reports/, is this what is expected? Of course
> after GORA-206 we've noticed that there are many other problems, but I
> think we should start moving along.

No errors are expected under target/surefire-reports. It is good that gora-cassandra has none, but there must still be errors in other backends, and some errors related to core. I am preparing my patches; I have solved some more bugs, and I have found more bugs that I still have to solve.

> Thanks again, and keep up with the great work!

Thank you for your feedback!

Regards,

Alfonso Nishikawa

