json2sstable fails due to OutOfMemory
-------------------------------------

                 Key: CASSANDRA-2189
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2189
             Project: Cassandra
          Issue Type: Bug
          Components: Tools
    Affects Versions: 0.7.2
         Environment: linux
            Reporter: Shotaro Kamio
            Priority: Minor


I have a json file created with sstable2json for a column family of super 
column type. Its size is about 1.9GB. (It's a dump of all keys because I cannot 
find out how to specify keys to dump in sstable2json.)
When I tried to create sstable from the json file, it failed with 
OutOfMemoryError as follows.

 WARN 00:31:58,595 Schema definitions were defined both locally and in 
cassandra.yaml. Definitions in cassandra.yaml were ignored.
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
        at java.lang.String.intern(Native Method)
        at org.codehaus.jackson.util.InternCache.intern(InternCache.java:40)
        at 
org.codehaus.jackson.sym.BytesToNameCanonicalizer.addName(BytesToNameCanonicalizer.java:471)
        at 
org.codehaus.jackson.impl.Utf8StreamParser.addName(Utf8StreamParser.java:893)
        at 
org.codehaus.jackson.impl.Utf8StreamParser.findName(Utf8StreamParser.java:773)
        at 
org.codehaus.jackson.impl.Utf8StreamParser.parseLongFieldName(Utf8StreamParser.java:379)
        at 
org.codehaus.jackson.impl.Utf8StreamParser.parseMediumFieldName(Utf8StreamParser.java:347)
        at 
org.codehaus.jackson.impl.Utf8StreamParser._parseFieldName(Utf8StreamParser.java:304)
        at 
org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:140)
        at 
org.codehaus.jackson.map.deser.UntypedObjectDeserializer.mapObject(UntypedObjectDeserializer.java:93)
        at 
org.codehaus.jackson.map.deser.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:65)
        at 
org.codehaus.jackson.map.deser.MapDeserializer._readAndBind(MapDeserializer.java:197)
        at 
org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:145)
        at 
org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:23)
        at 
org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:1261)
        at 
org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:517)
        at org.codehaus.jackson.JsonParser.readValueAs(JsonParser.java:897)
        at 
org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport.java:208)
        at 
org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:197)
        at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:421)

So, what I had to is that split the json file with "split" command and modify 
them to be correct json file. Create sstable for each small json files.

Could you change json2sstable to avoid OutOfMemory?

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to