[ https://issues.apache.org/jira/browse/CASSANDRA-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-2322.
---------------------------------------

    Resolution: Won't Fix

SSTables must be sorted, so importing unsorted data requires sorting it 
first.  If the data is too big for memory, you can either split it up or use 
normal Thrift inserts (or possibly CASSANDRA-1278 once it's done).  Adding 
disk-based sorting as a fourth option is a low-value proposition.
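
For readers unfamiliar with the term, "disk-based sorting" here means an 
external merge sort: sort memory-sized chunks, spill each sorted run to a 
temporary file, then merge the runs lazily.  The following is a minimal 
generic sketch in Python, not Cassandra code.  The function names, the 
max_in_memory parameter, and the one-record-per-line input format are all 
assumptions made for illustration; json2sstable does not work this way, and a 
real import would have to sort by whatever key ordering Cassandra expects 
rather than plain string order.

```python
import heapq
import os
import tempfile


def _spill(sorted_lines, tmp_dir):
    """Write one sorted run to a temp file and return its path (hypothetical helper)."""
    fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".run")
    with os.fdopen(fd, "w") as f:
        f.writelines(sorted_lines)
    return path


def external_sort(lines, max_in_memory, tmp_dir=None):
    """Yield newline-terminated lines in sorted (plain string) order.

    At most max_in_memory lines are held in RAM while each run is built;
    the merge phase keeps roughly one line per run in memory.
    """
    runs, chunk = [], []
    for line in lines:
        chunk.append(line)
        if len(chunk) >= max_in_memory:
            runs.append(_spill(sorted(chunk), tmp_dir))
            chunk = []
    if chunk:
        runs.append(_spill(sorted(chunk), tmp_dir))

    files = [open(p) for p in runs]
    try:
        # heapq.merge lazily merges inputs that are already sorted.
        yield from heapq.merge(*files)
    finally:
        for f in files:
            f.close()
        for p in runs:
            os.remove(p)
```

Every input line must end with a newline so that records stay separated in 
the run files.  The cost is one extra write and read of the whole data set, 
which is the added complexity weighed above against splitting the input or 
using normal inserts.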

> json2sstable OOMs when trying to import large, unsorted JSON files
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-2322
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2322
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 0.7.4
>            Reporter: Jason Harvey
>            Priority: Minor
>
> Looks like the importUnsorted function tries to read in the entire file when 
> determining the keyCountToImport. Any way it could do so without having to 
> consume a huge amount of memory on large files?
> My large SSTables became unsorted somehow so I exported via sstable2json, and 
> I am trying to re-import. I can get by via splitting the JSON files up, but 
> others may not know of this limitation and waste a decent amount of time 
> learning it. A reliable import of large JSON files would be very handy.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
