[ 
https://issues.apache.org/jira/browse/GORA-419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502113#comment-14502113
 ] 

ASF GitHub Bot commented on GORA-419:
-------------------------------------

Github user renato2099 commented on the pull request:

    https://github.com/apache/gora/pull/23#issuecomment-94309903
  
    Thanks a lot for the explanation, @gerhardgossen! And yes, this is a problem 
we have seen in other data stores as well, namely managing complex data types, 
since not all data stores provide the same functionality. For example, in 
gora-cassandra, depending on your mapping file, you could create subcolumns 
inside a super column or even separate columns. Then, when updating maps, you 
could end up updating a whole column even when only a single value was modified 
inside an array or map. This behaviour is of course wrong. I guess this is also 
happening in Accumulo, per your test.
    I think there is a trade-off here between generating a column for each 
individual value of a map/array, which leads to a more complex scan operation, 
and using a single column to store them all, which leads to the current behaviour.
    In Cassandra, arrays and maps can now be stored natively, so I guess we 
will be using them soon instead of adding this extra "mapping" complexity. Do 
you know if Accumulo stores complex data types or if it plans to?
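
    To make the trade-off above concrete, here is a rough in-memory sketch of 
the two layouts. It is only an illustration under simplifying assumptions: the 
class, the method names and the "column:key" cell naming are hypothetical and 
are not taken from gora-accumulo or gora-cassandra.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model of a single row; not Gora or Accumulo code.
class MapLayoutSketch {

    // Layout A: one cell per map entry, named "<column>:<mapKey>".
    // Updating one entry touches exactly one cell.
    static void putEntryPerColumn(SortedMap<String, String> row,
                                  String column, String key, String value) {
        row.put(column + ":" + key, value);
    }

    // Layout A read path: the map has to be rebuilt by scanning every
    // cell that shares the column prefix.
    static Map<String, String> readPerColumn(SortedMap<String, String> row,
                                             String column) {
        String prefix = column + ":";
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> cell : row.tailMap(prefix).entrySet()) {
            if (!cell.getKey().startsWith(prefix)) {
                break; // sorted keys: we have left the prefix range
            }
            result.put(cell.getKey().substring(prefix.length()), cell.getValue());
        }
        return result;
    }

    // Layout B: the whole map lives in one cell. Reading is a single lookup,
    // but changing one entry rewrites the entire value.
    static void putEntrySingleColumn(Map<String, Map<String, String>> row,
                                     String column, String key, String value) {
        Map<String, String> whole = new TreeMap<>(row.getOrDefault(column, Map.of()));
        whole.put(key, value);
        row.put(column, whole); // full replacement of the stored map
    }

    public static void main(String[] args) {
        SortedMap<String, String> rowA = new TreeMap<>();
        putEntryPerColumn(rowA, "mk", "a", "1");
        putEntryPerColumn(rowA, "mk", "b", "2");
        System.out.println(readPerColumn(rowA, "mk"));

        Map<String, Map<String, String>> rowB = new HashMap<>();
        putEntrySingleColumn(rowB, "mk", "a", "1");
        putEntrySingleColumn(rowB, "mk", "b", "2");
        System.out.println(rowB.get("mk"));
    }
}
```

    In this sketch, layout A keeps updates small at the cost of a prefix scan 
on reads, while layout B keeps reads trivial but replaces the whole value on 
every update.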
    



> AccumuloStore.put deletes entire row when updating map/array field
> ------------------------------------------------------------------
>
>                 Key: GORA-419
>                 URL: https://issues.apache.org/jira/browse/GORA-419
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: gora-accumulo
>    Affects Versions: 0.5, 0.6
>         Environment:     Gora 0.5
>     Accumulo 1.5.1
>     Zookeeper 3.4.6
>     Hadoop 1.2.1
>            Reporter: Gerhard Gossen
>            Priority: Critical
>
> In {{AccumuloStore.put(k, v)}} fields of type MAP or ARRAY are cleared first 
> before they are set to the new value. This is done in the methods 
> {{putMap}}/{{putArray}} using a call to {{deleteByQuery(q)}}. The name for 
> fields to be deleted is taken from the current column. However, 
> {{deleteByQuery}} tries to translate the field names of the query to column 
> names again, which fails with a log message like
> {code}
> 2015-04-13 13:43:35.084 ERROR 16733 --- [ool-46-thread-1] 
> o.a.gora.accumulo.store.AccumuloStore    : Mapping not found for field: ol
> 2015-04-13 13:43:35.104 ERROR 16733 --- [ool-46-thread-1] 
> o.a.gora.accumulo.store.AccumuloStore    : Mapping not found for field: mk
> 2015-04-13 13:43:35.115 ERROR 16733 --- [ool-46-thread-1] 
> o.a.gora.accumulo.store.AccumuloStore    : Mapping not found for field: mtdt
> {code}
> As a result, the query is not restricted to any field and the *entire row is 
> deleted*.
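
The failure mode described in the issue can be reproduced in a small in-memory 
model. This is a sketch only, not the AccumuloStore code: the class and method 
names are hypothetical, the column names "ol", "mk" and "mtdt" are borrowed from 
the log output above, and the field names they map from are assumed for 
illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the bug; none of these names are Gora APIs.
class RowDeleteSketch {

    // Field name -> column name, as a mapping file might define it.
    // The field names on the left are assumptions.
    static final Map<String, String> FIELD_TO_COLUMN = Map.of(
            "outlinks", "ol",
            "markers", "mk",
            "metadata", "mtdt");

    // Translate query field names to column names. Unknown names are
    // logged and skipped, so the result can end up empty.
    static Set<String> resolveColumns(List<String> fields) {
        Set<String> columns = new TreeSet<>();
        for (String field : fields) {
            String column = FIELD_TO_COLUMN.get(field);
            if (column == null) {
                System.err.println("Mapping not found for field: " + field);
            } else {
                columns.add(column);
            }
        }
        return columns;
    }

    // Modelled behaviour: an empty column restriction is treated as
    // "no restriction", i.e. the whole row is deleted.
    static void deleteByQuery(Map<String, String> row, List<String> fields) {
        Set<String> columns = resolveColumns(fields);
        if (columns.isEmpty()) {
            row.clear();
        } else {
            row.keySet().removeAll(columns);
        }
    }

    // A row with two mapped columns and one unrelated column.
    static Map<String, String> sampleRow() {
        Map<String, String> row = new HashMap<>();
        row.put("ol", "outlink data");
        row.put("mk", "marker data");
        row.put("title", "page title");
        return row;
    }

    public static void main(String[] args) {
        Map<String, String> row = sampleRow();
        // A column name is passed where a field name is expected, so the
        // lookup fails and the delete is no longer restricted.
        deleteByQuery(row, List.of("ol"));
        System.out.println("columns left after buggy call: " + row.keySet());

        Map<String, String> other = sampleRow();
        deleteByQuery(other, List.of("outlinks"));
        System.out.println("columns left after correct call: " + other.keySet());
    }
}
```

In the model, passing the already translated column name makes the second 
translation fail for every name, which is what turns a single-field delete into 
a delete of the entire row.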



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
