> On Feb 12, 2015, at 12:37 AM, Robert Coli <[email protected]> wrote:
> 
> On Wed, Feb 11, 2015 at 2:22 AM, Pavel Velikhov <[email protected]> wrote:
>   2. While trying to update the full dataset with a simple transformation 
> (again via python driver), single node and clustered Cassandra run out of 
> memory no matter what settings I try, even if I put a lot of sleeps into the 
> mix. However simpler transformations (updating just one column, especially 
> when there is a lot of processing overhead) work just fine.
> 
> What does a "simple transformation" mean here? Assuming a reasonable sized 
> heap, OOM sounds like you're trying to update a large number of large 
> partitions in a single operation.
> 
> In general, in Cassandra, you're best off interacting with a single or small 
> number of partitions in any given interaction.
> 
> =Rob
> 

Hi Robert!

  A simple transformation is changing just a single column value (I usually 
do it for the whole dataset).
  But when I was running out of memory, I was reading in 5 columns and updating 
3. Some of them could be big, but I need to check and rerun this case.
  (I worked around this by dumping to files and then scanning the files and 
updating the database, but this stinks!)

  I don’t quite understand the fundamentals of Cassandra - if I’m just doing 
one scan that fetches a reasonable number of columns, and I’m updating at 
the same time, what’s happening there? Why does it eat up so much memory and die?
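
  For concreteness, the scan-and-update pattern in question can be sketched 
roughly as below, in a form that tries to keep memory bounded on both sides: 
the read is paged, and the number of writes in flight is capped. This is only 
a sketch under assumptions, not the actual job: the table ks.items, its 
columns id and payload, the function name scan_and_update and the fetch_size 
and window values are all made up for illustration. The driver calls used 
(Session.prepare, Session.execute, Session.execute_async, 
Session.default_fetch_size) are from the DataStax python driver.

```python
def scan_and_update(session, transform, fetch_size=500, window=100):
    """Rewrite one column for every row while bounding client memory use."""
    # Page the full-table read: the driver pulls fetch_size rows at a time
    # and fetches the next page transparently while we iterate.
    session.default_fetch_size = fetch_size

    # Hypothetical schema: ks.items(id PRIMARY KEY, payload).
    update = session.prepare("UPDATE ks.items SET payload = ? WHERE id = ?")

    in_flight = []
    updated = 0
    for row in session.execute("SELECT id, payload FROM ks.items"):
        params = (transform(row.payload), row.id)
        in_flight.append(session.execute_async(update, params))
        # Cap outstanding writes: drain the current window before sending more.
        if len(in_flight) >= window:
            for future in in_flight:
                future.result()  # blocks; raises if the write failed
            updated += len(in_flight)
            in_flight = []

    # Drain whatever is left after the last full window.
    for future in in_flight:
        future.result()
    updated += len(in_flight)
    return updated
```

  With a real cluster this would be called with something like 
session = Cluster(["127.0.0.1"]).connect() and a transform function of your 
own; the contact point is a placeholder. Whether this avoids the OOM depends 
on where the memory is actually going, which still needs to be checked.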
