Hi Renato,

2013/3/8 Renato Marroquín Mogrovejo <[email protected]>:
> My bad, yeah I think you should open different issues so we can have
> it on JIRA and get it done in the future. It's like the Pig adapter,
> it's been there for a while, but being on JIRA helps us all be aware
> of what needs to be improved or done.

Ah, ok :)

> Ok, so I am the one not understanding here man, sorry ):
> When you talk about multi type unions, are you referring to nested
> records? or are you referring to them as separate things?

Good question. I think we can distinguish three cases:

- Optional-type unions = ['null','type'] (technically multitype, but
used only to create "optional" fields)
- Multitype = optional-type unions + unions like ['type1','type2',...] +
unions like ['null','type1','type2',...]
- Nested = subset of multitype where the outer record is one of the
inner types: record type1 with a field of type ['null','type1',...]
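
To make the three cases concrete, here is a minimal sketch in plain Java
(no Avro dependency). The class and method names are hypothetical, and a
union is modelled simply as the list of its branch type names:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Gora or Avro: classifies a union
// given as a list of branch type names, e.g. ["null", "type1"].
final class UnionKinds {

    // Optional-type union: exactly two branches, one of them "null".
    static boolean isOptional(List<String> branches) {
        return branches.size() == 2 && branches.contains("null");
    }

    // Multitype: any union with two or more branches
    // (optional-type unions are included by definition).
    static boolean isMultitype(List<String> branches) {
        return branches.size() >= 2;
    }

    // Nested: the union is the type of a field of record `outerRecord`
    // and lists that same record as one of its branches.
    static boolean isNested(String outerRecord, List<String> branches) {
        return isMultitype(branches) && branches.contains(outerRecord);
    }

    public static void main(String[] args) {
        List<String> union = Arrays.asList("null", "type1", "type2");
        System.out.println("optional=" + isOptional(union)
                + " multitype=" + isMultitype(union)
                + " nested=" + isNested("type1", union));
    }
}
```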

My HBase fix handles all three.

>>> About Cassandra issues, the cloning process you are describing is
>>> a problem that Roland was looking into, let's hope we can work that one
>>> out soon. The way Gora-Cassandra serializes data is what you've
>>> described in your first option, and I also think the second one is a
>>> better option.
>>
>> I am thinking about something different than you.
>>
>> The way that Gora-cassandra serializes is what is shown in
>> "Implementation details in Cassandra" excluding "Proposed
>> implementations are two:"
>> The way described in "Proposed implementations > First option" is not
>> gora-cassandra, but the approach of HBase.
>> The way described in "Proposed implementations > Second option", as you
>> say, seems much better.
>
> Man, I might be looking at the wrong place, but in [1] we always
> use a ByteBufferSerializer to store our data, we do use specific
> serializers to obtain ByteBuffers from the value in
> CassandraClient[2], but at the end we store HColumns with byteBuffers.
> And to retrieve the data, we use the json schema to know which
> serializer to use and be able to get the data as it originally was.

Gora-cassandra uses Hector serializers and its own Gora-cassandra
serializers, not Avro ones.
Concretely, the Avro serializer can (supposedly) serialize unions
without modification.

As you can check, CassandraClient takes the serializer for each
field (and key) from here:

 this.keySerializer = GoraSerializerTypeInferer.getSerializer(keyClass);

which ultimately resolves to Hector serializers and
org.apache.gora.cassandra.serializers.*
(let me know if I am wrong)
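
As a rough illustration of that kind of class-based lookup, here is a
sketch in plain Java. It is not Gora's or Hector's actual code:
SerializerLookup and its methods are made-up names, and only two types
are registered:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a type-inferring serializer registry.
final class SerializerLookup {

    private static final Map<Class<?>, Function<Object, ByteBuffer>> SERIALIZERS =
            new HashMap<>();

    static {
        // Strings are stored as their UTF-8 bytes.
        SERIALIZERS.put(String.class,
                v -> ByteBuffer.wrap(((String) v).getBytes(StandardCharsets.UTF_8)));
        // Integers are stored as 4 big-endian bytes.
        SERIALIZERS.put(Integer.class, v -> {
            ByteBuffer buffer = ByteBuffer.allocate(4);
            buffer.putInt((Integer) v);
            buffer.flip(); // make the written bytes readable
            return buffer;
        });
    }

    // Picks the serializer from the Java class alone.
    static Function<Object, ByteBuffer> getSerializer(Class<?> clazz) {
        Function<Object, ByteBuffer> serializer = SERIALIZERS.get(clazz);
        if (serializer == null) {
            throw new IllegalArgumentException(
                    "No serializer registered for " + clazz.getName());
        }
        return serializer;
    }

    public static void main(String[] args) {
        ByteBuffer key = getSerializer(String.class).apply("row-1");
        System.out.println("serialized key length: " + key.remaining());
    }
}
```

With a lookup like this one, the serializer is chosen from a single Java
class, so a union field would need some extra rule to pick the branch.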

> Thanks again man!

Thank you!

> [1] 
> https://github.com/renato2099/gora/blob/trunk/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/HectorUtils.java
> [2] 
> https://github.com/renato2099/gora/blob/trunk/gora-cassandra/src/main/java/org/apache/gora/cassandra/store/CassandraClient.java

Regards,

Alfonso Nishikawa
