Hi Prabhu,

How did you end up converting your CSV into JSON?

PutHBaseJSON creates a single row from a JSON document. In your example
above, using n1 as the rowId, it would create a row with columns n2 - n22.
Are you seeing columns missing, or are you missing whole rows from your
original CSV?
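
One common cause of "many inputs, very few rows" is a non-unique row identifier: an HBase put is an upsert keyed on the row id, so documents sharing the same n1 collapse into one row. A minimal sketch of that behavior (plain Python simulating the mapping, not actual NiFi or HBase API; names are illustrative):

```python
import json

def put_hbase_json(table, doc_json, row_field="n1", family="Sweet"):
    """Simulate PutHBaseJSON: one JSON document becomes one row,
    with every non-row-id field stored as a column in the family."""
    doc = json.loads(doc_json)
    row_id = doc[row_field]
    columns = {f"{family}:{k}": v for k, v in doc.items() if k != row_field}
    # HBase "put" semantics: same row id overwrites, it never appends a new row
    table.setdefault(row_id, {}).update(columns)

table = {}
docs = [
    '{"n1": "a", "n2": "1", "n3": "2"}',
    '{"n1": "a", "n2": "9", "n3": "8"}',  # duplicate n1 -> overwrites row "a"
    '{"n1": "b", "n2": "5", "n3": "6"}',
]
for d in docs:
    put_hbase_json(table, d)

print(len(table))              # 2 rows, not 3
print(table["a"]["Sweet:n2"])  # "9" -- the latest value wins
```

If the CSV's first column has only a handful of distinct values, this alone would explain a table with only 10 rows.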

Thanks,

Bryan



On Mon, Apr 11, 2016 at 11:43 AM, prabhu Mahendran <prabhuu161...@gmail.com>
wrote:

> Hi Simon/Joe,
>
> Thanks for this support.
> I have successfully converted the CSV data into JSON and also inserted
> that JSON data into an HBase table using PutHBaseJSON.
> Part of the JSON sample data is below:
>
> {
>   "n1": "<value>",
>   "n2": "<value>",
>   "n3": "<value>",
>   "n4": "<value>",
>   "n5": "<value>",
>   "n6": "<value>",
>   "n7": "<value>",
>   "n8": "<value>",
>   "n9": "<value>",
>   "n10": "<value>",
>   "n11": "<value>",
>   "n12": "<value>",
>   "n13": "<value>",
>   "n14": "<value>",
>   "n15": "<value>",
>   "n16": "<value>",
>   "n17": "<value>",
>   "n18": "<value>",
>   "n19": "<value>",
>   "n20": "<value>",
>   "n21": "-<value>",
>   "n22": "<value>"
> }
> PutHBaseJSON:
> Table Name: 'Hike', Column Family: 'Sweet', Row Identifier Field Name:
> n1 (an element in the JSON file).
>
> My file contains 15 lakh (1.5 million) rows, but the HBase table contains
> only 10 rows. NiFi reads all 15 lakh rows but stores only a few of them.
>
> Anyone please help me to solve this?
>
>
>
>
> Prabhu,
>
> If the dataset being processed can be split up and still retain the
> necessary meaning when input to HBase, I'd recommend doing that. NiFi
> itself, as a framework, can handle very large objects because its API
> doesn't force loading entire objects into memory. However, various
> processors may do that, and I believe ReplaceText may be one of them.
> You can use SplitText, ExecuteScript, or other processors to do the
> splitting if that helps your case.
>
> Thanks
> Joe
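
To illustrate Joe's point about splitting: the idea is to stream the large file and emit bounded chunks, so no single step ever holds the whole 1 GB in memory. A rough plain-Python sketch of what SplitText does conceptually (illustrative only; inside NiFi you would just configure SplitText's Line Split Count):

```python
import io

def split_lines(stream, lines_per_split=10000):
    """Stream a text source and yield chunks of at most
    lines_per_split lines, never buffering the whole input."""
    chunk = []
    for line in stream:
        chunk.append(line)
        if len(chunk) >= lines_per_split:
            yield "".join(chunk)
            chunk = []
    if chunk:  # emit the final partial chunk
        yield "".join(chunk)

# 25 lines split into chunks of 10 -> 3 splits (10 + 10 + 5)
data = io.StringIO("".join(f"row{i}\n" for i in range(25)))
splits = list(split_lines(data, lines_per_split=10))
print(len(splits))  # 3
```

Each split can then flow through ReplaceText/ExtractText/PutHBaseJSON as a small, memory-friendly unit.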
>
> On Sat, Apr 9, 2016 at 6:35 PM, Simon Ball <sb...@hortonworks.com> wrote:
> > Hi Prabhu,
> >
> > Did you try increasing the heap size in conf/bootstrap.conf? By default
> > NiFi uses a very small RAM allocation (512 MB). You can increase this by
> > tweaking java.arg.2 and java.arg.3 in the bootstrap.conf file. Note that
> > this is the Java heap, so you will need more than your data size to
> > account for Java object overhead. The other thing to check is the buffer
> > sizes you are using for your ReplaceText processors. If you're also
> > using Split processors, you can sometimes run up against RAM and
> > open-file limits; if this is the case, make sure you increase the
> > ulimit -n settings.
> >
> > Simon
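
For reference, the settings Simon mentions live in conf/bootstrap.conf. The stock entries look like the commented lines below; the 4 GB figures are only an illustration, so size them to your data:

```properties
# conf/bootstrap.conf -- JVM memory arguments (NiFi defaults shown commented)
# java.arg.2=-Xms512m
# java.arg.3=-Xmx512m
java.arg.2=-Xms4g
java.arg.3=-Xmx4g
```

The open-file limit is an OS setting, checked with `ulimit -n` in the shell that launches NiFi, and raised via limits.conf (or `ulimit -n <value>` for a quick test) rather than in bootstrap.conf.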
> >
> > On 9 Apr 2016, at 16:51, prabhu Mahendran <prabhuu161...@gmail.com>
> wrote:
> >
> > Hi,
> >
> > I am new to NiFi and do not know how to process large data, such as a
> > 1 GB CSV file, into HBase. Trying the combination of GetFile and PutHBase
> > shell leads to a Java out-of-memory error, and the combination of
> > ReplaceText, ExtractText and PutHBaseJSON doesn't work on the large
> > dataset, although it works correctly on a smaller one.
> > Can anyone please help me to solve this?
> > Thanks in advance.
> >
> > Thanks & Regards,
> > Prabhu Mahendran
> >
> >
>
