For SplitJson I am using just "$" to try to split the array into individual
objects.  It worked on a small subset, but a large file seems to just hang.

From: [email protected]
Date: Thu, 24 Sep 2015 18:54:06 -0400
Subject: Re: Array into MongoDB
To: [email protected]

Bryan is correct about the backing library reading everything into memory to do 
the evaluation.
Might I ask what expression you are using?
On Thu, Sep 24, 2015 at 6:44 PM, Adam Williams <[email protected]> 
wrote:



I even tried it with 6GB and had no luck.  It's receiving the FlowFiles, but 
nothing happens after that.  If I do it with a small subset (3 JSON objects) it 
works perfectly.  When I throw the 180MB file at it, it just spins with no 
logging, errors, etc.  Very odd.  Any thoughts?
Thanks

From: [email protected]
To: [email protected]
Subject: RE: Array into MongoDB
Date: Thu, 24 Sep 2015 21:23:35 +0000




Bryan,
I think that is what's happening; the fans are spinning like crazy.  This is my 
current bootstrap.conf.  I will bump it up.  Are there any other settings I 
should bump too?
java.arg.2=-Xms512m
java.arg.3=-Xmx2048m
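For reference, a bumped-up heap configuration for bootstrap.conf might look like the following. The exact values here are illustrative guesses, not official guidance; the idea is simply that the max heap should comfortably exceed the size of the content being read into memory:

```properties
# bootstrap.conf -- illustrative heap settings (values are assumptions)
java.arg.2=-Xms1024m
java.arg.3=-Xmx6144m
```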
Thanks

Date: Thu, 24 Sep 2015 17:20:27 -0400
Subject: Re: Array into MongoDB
From: [email protected]
To: [email protected]

One other thing I thought of... I think the JSON processors read the entire 
FlowFile content into memory to do the splitting/evaluating, so I wonder if you 
are running into a memory issue with a 180MB JSON file.
Are you running with the default configuration of 512MB set in 
conf/bootstrap.conf?  If so, it would be interesting to see what happens if you 
bump that up.
On Thu, Sep 24, 2015 at 5:06 PM, Bryan Bende <[email protected]> wrote:
Adam,
Based on that message, I suspect that MongoDB does not support sending in an 
array of documents, since it looks like it expects the first character to be 
the start of a document and not an array.
With regards to the SplitJson processor, if you set the JSON Path to $ then it 
should split at the top-level and send out each of your two documents on the 
splits relationship.
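Bryan's description of the split can be sketched in plain Python. This is a hedged illustration of the behavior, not NiFi's actual implementation; the variable names are mine:

```python
import json

# The content of the incoming FlowFile: a top-level JSON array.
flowfile_content = '[{"id":1, "stat":"something"},{"id":2, "stat":"anothersomething"}]'

# JSON Path "$" selects the root.  Since the root is an array, each
# element is emitted as its own FlowFile on the "splits" relationship.
root = json.loads(flowfile_content)
splits = [json.dumps(element) for element in root]

for split in splits:
    print(split)
```

Note that this sketch, like the processor, still parses the whole array in one go, which is exactly where a 180MB file runs into heap trouble.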
-Bryan

On Thu, Sep 24, 2015 at 4:36 PM, Adam Williams <[email protected]> 
wrote:



I have an array of JSON objects I am trying to put into Mongo, but I keep 
hitting this on the PutMongo processor:
ERROR [Timer-Driven Process Thread-1] o.a.nifi.processors.mongodb.PutMongo 
PutMongo[id=c576f8cc-6e21-4881-a7cd-6e3881838a91] Failed to insert 
StandardFlowFileRecord[uuid=2c670a40-7934-4bc6-b054-1cba23fe7b0f,claim=StandardContentClaim
 [resourceClaim=StandardResourceClaim[id=1443125646319-1, container=default, 
section=1], offset=0, length=208380820],offset=0,name=test.json,size=208380820] 
into MongoDB due to org.bson.BsonInvalidOperationException: readStartDocument 
can only be called when CurrentBSONType is DOCUMENT, not when CurrentBSONType 
is ARRAY.: org.bson.BsonInvalidOperationException: readStartDocument can only 
be called when CurrentBSONType is DOCUMENT, not when CurrentBSONType is ARRAY.

I tried to use the SplitJson processor to split the array into segments, but in 
my experience I can't pull out each JSON object.  The SplitJson processor just 
hangs and never produces logs or any output at all.  The structure of my data 
is:
[{"id":1, "stat":"something"},{"id":2, "stat":"anothersomething"}]
The JSON file itself is pretty large (>100mb).
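Outside of NiFi, one way to peel documents off a top-level array like this without materializing every element at once is incremental decoding with Python's standard library. This is a minimal sketch under my own assumptions; the function name is hypothetical and not part of any NiFi or MongoDB API:

```python
import json

def iter_array_elements(text):
    """Yield each element of a top-level JSON array one at a time,
    decoding with raw_decode so only one element is built per step."""
    decoder = json.JSONDecoder()
    idx = text.index('[') + 1
    while True:
        # Skip whitespace and commas between elements.
        while idx < len(text) and text[idx] in ' \t\r\n,':
            idx += 1
        if idx >= len(text) or text[idx] == ']':
            return
        obj, idx = decoder.raw_decode(text, idx)
        yield obj

sample = '[{"id":1, "stat":"something"},{"id":2, "stat":"anothersomething"}]'
for doc in iter_array_elements(sample):
    print(json.dumps(doc))  # one standalone document per line
```

Each yielded object serializes as a standalone document, which is the shape PutMongo's error message says it expects (DOCUMENT rather than ARRAY at the top level).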
Thank you