Hi Renato,

2013/3/8 Renato Marroquín Mogrovejo <[email protected]>:
> My bad, yeah I think you should open different issues so we can have
> it on JIRA and get it done in the future. It's like the Pig adapter,
> it's been there for a while, but being on JIRA helps us all be aware
> of what needs to be improved or done.

Ah, ok :)

> Ok, so I am the one not understanding here man, sorry ):
> When you talk about multi type unions, are you referring to nested
> records? or are you referring them as separate things?

Good question. I think we can distinguish:

- Optional-type-unions = ['null','type'] (these are multitype, but only
  used to create "optional" fields)
- Multitype = Optional-type-unions + unions as ['type1','type2',...]
  + unions as ['null','type1','type2',...]
- Nested = Subset of Multitype where the outer record is one of the
  inner types: record type1 with a field of type ['null','type1',...]

My HBase fix handles all three.

>>> About Cassandra issues, the cloning process you are describing is
>>> problem that Roland was looking into, let's hope we can work that one
>>> out soon. The way Gora-Cassandra serializes data is what you've
>>> described in your first option, and I also think the second one is a
>>> better option.
>>
>> I am thinking about something different than you.
>>
>> The way that Gora-cassandra serializes is what is shown in
>> "Implementation details in Cassandra" excluding "Proposed
>> implementations are two:"
>> The way described in "Proposed implementations > First option" is not
>> gora-cassandra, but the approach of HBase.
>> The way described in "Proposed implementations > Second option", as you
>> say, seems much better.
>
> Man, I might be looking at the wrong place but in [1], but we always
> use a ByteBufferSerializer to store our data, we do use specific
> serializers to obtain ByteBuffers from the value in
> CassandraClient[2], but at the end we store HColumns with byteBuffers.
> And to retrieve the data, we use the json schema to know which
> serializer to use and be able to get the data as it originally was.

Gora-cassandra uses Hector serializers and its own Gora-cassandra
serializers, not the Avro ones. In particular, the Avro serializer can
(supposedly) serialize unions without modification.
As you can check, CassandraClient takes the serializer for each field
(and key) from here:

    this.keySerializer = GoraSerializerTypeInferer.getSerializer(keyClass);

which ultimately resolves to the Hector serializers and
org.apache.gora.cassandra.serializers.* (let me know if I am wrong).

> Thanks again man!

Thank you!

> [1]
> https://github.com/renato2099/gora/blob/trunk/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/HectorUtils.java
> [2]
> https://github.com/renato2099/gora/blob/trunk/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraClient.java

Regards,

Alfonso Nishikawa
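
P.S. To make the three union categories from this message concrete, here is a minimal sketch in plain Java. It is only an illustration under assumptions: the class `UnionKinds`, the enum `Kind` and the method `classify` are hypothetical names and not part of Gora or Avro, and a union is modelled as a plain list of type names instead of an Avro schema object. Since the categories overlap (Nested is a subset of Multitype), the sketch reports the most specific one.

```java
import java.util.List;

public class UnionKinds {

    /** The three categories discussed in this thread (hypothetical names). */
    public enum Kind { OPTIONAL, MULTITYPE, NESTED }

    /**
     * Classifies a union field by its most specific category.
     *
     * @param enclosingRecord name of the record that declares the field
     * @param branches        type names of the union, e.g. ["null", "string"]
     */
    public static Kind classify(String enclosingRecord, List<String> branches) {
        // Nested: the outer record is itself one of the union's inner types.
        if (branches.contains(enclosingRecord)) {
            return Kind.NESTED;
        }
        // Optional: exactly ['null', 'type'], used only to make a field optional.
        if (branches.size() == 2 && branches.contains("null")) {
            return Kind.OPTIONAL;
        }
        // Anything else: ['type1','type2',...] or ['null','type1','type2',...].
        return Kind.MULTITYPE;
    }

    public static void main(String[] args) {
        System.out.println(classify("type1", List.of("null", "string")));
        System.out.println(classify("type1", List.of("string", "int")));
        System.out.println(classify("type1", List.of("null", "string", "int")));
        System.out.println(classify("type1", List.of("null", "type1")));
    }
}
```

Checking for the nested case first is a deliberate choice here: a field of type ['null','type1'] inside record type1 also looks like an optional union, but the recursion is what makes it hard for a datastore mapping.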

