Hi again Alfonso (:

More comments.

2013/3/7 Alfonso Nishikawa <[email protected]>:
> Hi Roland,
>
>> I've read over the part concerning cassandra.
>> Have you seen GORA-211 and our discussion about cloning there?
>> Can you explain a bit more what you're thinking about here:
>> "Wrongly creates a new Persistent "by hand" instead using
>> PersistentBase#clone()"
>> I'm relatively sure that my last problem from NUTCH-1534 (the
>> InvalidRequestException(why:column name must not be empty)) is located
>> somewhere in the cloning code from gora-cassandra, but I can't find it
>> right now.
>
> Sure, explanation going :)
> Gora-0.2.1 @ CassandraStore.java#put():286 does this: "* Duplicate
> instance to keep all the objects in memory till flushing."
>
> Some minor but important things.
>
> First, it creates a new empty instance with:
>
>  T p = (T) value.newInstance(new StateManagerImpl());
>
> but actually should have been created with:
>
>  T p = this.getBeanFactory().newPersistent() ;

Could you please explain why this is a better creational approach? Do
you know how the HBase module does this? IMO we should make all data
stores use the same approach, or at least a similar one.

> But the real point is that the whole method should be implemented as
> follows (cloning is done in
> PersistentDatumReader#clone(Persistent,Schema):215):
>
>  public void put(K key, T value) {
>    this.buffer.put(key, value.clone()) ;
>  }
>
> But this is not really important, I guess.

Isn't this mainly for MapReduce access?
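
To check that we mean the same thing by "duplicate instance to keep all the objects in memory till flushing", here is a minimal sketch of why the copy matters when a caller reuses one instance between puts. PageSketch, CopyOnPutStore and copy() are hypothetical stand-ins I made up for illustration; they are not Gora classes, and copy() only plays the role that value.clone() has in your snippet.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical mutable record; stands in for a Persistent bean. */
class PageSketch {
    String title;

    PageSketch(String title) {
        this.title = title;
    }

    /** Plays the role of clone() in this sketch. */
    PageSketch copy() {
        return new PageSketch(title);
    }
}

/** Hypothetical buffering store; NOT the CassandraStore implementation. */
class CopyOnPutStore {
    // Values wait here until a (not shown) flush writes them out.
    private final Map<String, PageSketch> buffer = new LinkedHashMap<>();

    void put(String key, PageSketch value) {
        // Store a copy so later changes to the caller's instance
        // cannot alter what is waiting in the buffer.
        buffer.put(key, value.copy());
    }

    String bufferedTitle(String key) {
        return buffer.get(key).title;
    }
}
```

If put() stored the caller's reference directly, a caller that reuses the same object for the next record would silently change the buffered value before the flush.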

> Anyway, this does not seem to be the problem shown in NUTCH-1534.
> I think the problem in NUTCH-1534 is what you mentioned about multiple
> threads. CassandraClient is not reentrant because Mutator is not
> reentrant, so it must be used from only one thread. Could you please
> try this?
>
> * Update to gora-0.2.1
> * Modify CassandraStore:340 so the line reads as this:
>
>  private synchronized void addOrUpdateField(K key, Field field, Object value) {
>
> The same should be for gora-0.2 (at CassandraStore:301), but I like
> 0.2.1 and patches must be for /trunk (desirable).
>
> Maybe I am wrong, but please, give it a shot :)

Yeah, there are several ways to accomplish this.
From what I recall of GORA-211, Roland suggested creating a lock
object and synchronizing only around the read/write/update operations,
but we would have to evaluate whether synchronizing the whole
encapsulating method causes any performance penalty.
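
To make the two options concrete, here is a rough sketch of both, assuming the shared state is a plain non-thread-safe map. WriteBuffer, MethodSyncBuffer, LockObjectBuffer and SyncSketch are hypothetical names for illustration only; none of this is gora-cassandra code, and the String parameters just stand in for K, Field and Object.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical write buffer; stands in for the shared, non-thread-safe state. */
interface WriteBuffer {
    void addOrUpdateField(String key, String field, String value);
    int size();
}

/** Option A: synchronize the whole encapsulating method. */
class MethodSyncBuffer implements WriteBuffer {
    private final Map<String, String> pending = new HashMap<>();

    @Override
    public synchronized void addOrUpdateField(String key, String field, String value) {
        // Everything, including building the column name, runs under the lock.
        pending.put(key + ":" + field, value);
    }

    @Override
    public synchronized int size() {
        return pending.size();
    }
}

/** Option B: a private lock object guards only the shared-state access. */
class LockObjectBuffer implements WriteBuffer {
    private final Object lock = new Object();
    private final Map<String, String> pending = new HashMap<>();

    @Override
    public void addOrUpdateField(String key, String field, String value) {
        String column = key + ":" + field; // per-thread work stays outside the lock
        synchronized (lock) {
            pending.put(column, value);
        }
    }

    @Override
    public int size() {
        synchronized (lock) {
            return pending.size();
        }
    }
}

class SyncSketch {
    /** Writes distinct entries from several threads and returns the final entry count. */
    static int writeConcurrently(WriteBuffer buffer, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    buffer.addOrUpdateField("row" + id, "f" + i, "v");
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return buffer.size();
    }
}
```

Both variants keep every write. They differ only in how much work happens while the lock is held, which is the part we would have to measure before choosing one.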


Renato M.

> Regards,
>
> Alfonso Nishikawa
>
> 2013/3/7 Roland <[email protected]>:
>> Hi Alfonso,
>>
>> I've read over the part concerning cassandra.
>> Have you seen GORA-211 and our discussion about cloning there?
>> Can you explain a bit more what you're thinking about here:
>> "Wrongly creates a new Persistent "by hand" instead using
>> PersistentBase#clone()"
>>
>> I'm relatively sure that my last problem from NUTCH-1534 (the
>> InvalidRequestException(why:column name must not be empty)) is located
>> somewhere in the cloning code from gora-cassandra, but I can't find it right
>> now.
>>
>> Thanks a lot for this write-up,
>> Roland
>>
>> Am 06.03.2013 11:53, schrieb Alfonso Nishikawa:
>>
>>> Hello everybody,
>>>
>>> I finally finished some important notes. I would like you to review
>>> and comment on them :)
>>> https://people.apache.org/~alfonsonishikawa/gora-174-notes.html
>>>
>>> Thank you!
>>>
>>> Regards,
>>>
>>> Alfonso Nishikawa
>>>
>>
>
>
>
> --
> "Drinking bloody marys all night will make you feel like a corpse in
> the morning."
