Thanks, will do that.

----
Saad
On Mon, Mar 12, 2018 at 12:14 PM, Ted Yu <[email protected]> wrote:

> Saad:
> I encourage you to open an HBase JIRA outlining your use case and the
> config knobs you added through a patch.
>
> We can see the details for each config and make recommendations
> accordingly.
>
> Thanks
>
> On Mon, Mar 12, 2018 at 8:43 AM, Saad Mufti <[email protected]> wrote:
>
> > I have created a company-specific branch and added 4 new flags to
> > control this behavior; these gave us a huge performance boost when
> > running Spark jobs on snapshots of very large tables in S3. I tried to
> > do everything cleanly, but:
> >
> > a) not being familiar with the test strategies, I haven't had time to
> > add any useful tests. Of course I left the default behavior the same,
> > and much of the behavior these flags control only affects performance,
> > not the final result, so I would need some pointers on how to add
> > useful tests.
> >
> > b) I added a new flag to act as an overall override for prefetch
> > behavior, one that overrides any setting even in the column family
> > descriptor; I am not sure what I did was entirely in the spirit of
> > what HBase does.
> >
> > Again, if used properly these flags would only impact jobs using
> > TableSnapshotInputFormat in their Spark or M-R jobs. Would someone
> > from the core team be willing to look at my patch? I have never done
> > this before, so I would appreciate a quick pointer on how to send a
> > patch and get some quick feedback.
> >
> > Cheers.
> >
> > ----
> > Saad
> >
> > On Sat, Mar 10, 2018 at 9:56 PM, Saad Mufti <[email protected]> wrote:
> >
> > > The question remains, though, of why it is even accessing a column
> > > family's files that should be excluded based on the Scan. And that
> > > column family does NOT specify prefetch on open in its schema. Only
> > > the one we want to read specifies prefetch on open, which we want to
> > > override if possible for the Spark job.
> > >
> > > ----
> > > Saad
> > >
> > > On Sat, Mar 10, 2018 at 9:51 PM, Saad Mufti <[email protected]> wrote:
> > >
> > >> See below for more I found on item 3.
> > >>
> > >> Cheers.
> > >>
> > >> ----
> > >> Saad
> > >>
> > >> On Sat, Mar 10, 2018 at 7:17 PM, Saad Mufti <[email protected]> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I am running a Spark job (Spark 2.2.1) on an EMR cluster in AWS.
> > >>> There is no HBase installed on the cluster, only HBase libs linked
> > >>> to my Spark app. We are reading the snapshot info from an HBase
> > >>> folder in S3 using the TableSnapshotInputFormat class from HBase
> > >>> 1.4.0, so that the Spark job reads snapshot info directly from the
> > >>> S3-based filesystem instead of going through any region server.
> > >>>
> > >>> I have observed a few behaviors while debugging performance that
> > >>> are concerning; some we could mitigate, and on others I am looking
> > >>> for clarity:
> > >>>
> > >>> 1) The TableSnapshotInputFormatImpl code tries to get locality
> > >>> information for the region splits. For a snapshot with a large
> > >>> number of files (over 350,000 in our case) this causes a scan of
> > >>> all the file listings in a single thread in the driver. And it was
> > >>> useless, because there is no useful locality information to glean
> > >>> when all the files are in S3 rather than HDFS. So I was forced to
> > >>> make a copy of TableSnapshotInputFormatImpl.java in our code and
> > >>> control this with a config setting I made up. That got rid of the
> > >>> hours-long scan, so I am good with this part for now.
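> > >>>
> > >>> The guard is roughly the following (a simplified sketch from our
> > >>> branch: the config key is a name I invented, and the elided listing
> > >>> code stands in for the stock logic that fills the
> > >>> HDFSBlocksDistribution):
> > >>>
> > >>> boolean localityEnabled = conf.getBoolean(
> > >>>     "hbase.TableSnapshotInputFormat.locality.enabled", true); // invented key
> > >>> List<String> hosts;
> > >>> if (localityEnabled) {
> > >>>   // stock path: walk every store file of the region to build the
> > >>>   // block distribution, then rank hosts by cumulative block weight
> > >>>   HDFSBlocksDistribution dist = new HDFSBlocksDistribution();
> > >>>   // ... existing listing code populates 'dist' from the filesystem ...
> > >>>   hosts = getBestLocations(conf, dist); // existing helper in this class
> > >>> } else {
> > >>>   // S3 reports no meaningful locality, so skip the listing entirely
> > >>>   hosts = java.util.Collections.emptyList();
> > >>> }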
> > >>>
> > >>> 2) I have set a single column family in the Scan that I set on the
> > >>> HBase configuration via:
> > >>>
> > >>> scan.addFamily(str.getBytes())
> > >>>
> > >>> hBaseConf.set(TableInputFormat.SCAN, convertScanToString(scan))
> > >>>
> > >>> But when this code executes under Spark and I observe the threads
> > >>> and logs on the Spark executors, I see it reading S3 files for a
> > >>> column family that was not included in the scan. This column family
> > >>> was intentionally excluded because it is much larger than the
> > >>> others, and we wanted to avoid the cost of reading it.
> > >>>
> > >>> Any advice on what I am doing wrong would be appreciated.
> > >>>
> > >>> 3) We also explicitly set caching of blocks to false on the scan,
> > >>> although I see that in TableSnapshotInputFormatImpl.java it is set
> > >>> to false internally anyway. But when running the Spark job, some
> > >>> executors were taking much longer than others, and when I observed
> > >>> their threads, I saw periodic messages about a few hundred megs of
> > >>> RAM used by the block cache, with the thread sitting there reading
> > >>> data from S3 and occasionally blocked by a couple of other threads
> > >>> with "hfile-prefetcher" in their names. Going back to 2) above,
> > >>> they seem to be reading the wrong column family, but in this item I
> > >>> am more concerned about why they appear to be prefetching and
> > >>> caching blocks at all, when the Scan object is set to not cache
> > >>> blocks.
> > >>>
> > >> I think I figured out item 3: the column family descriptor for the
> > >> table in question has prefetch on open set in its schema. Now, for
> > >> the Spark job, I don't think this serves any useful purpose, does
> > >> it? But I can't see any way to override it. If there is, I'd
> > >> appreciate some advice.
> > >>
> > >> Thanks.
> > >>
> > >>> Thanks in advance for any insights anyone can provide.
> > >>>
> > >>> ----
> > >>> Saad
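
P.S. For anyone who hits the same problem on S3, the overall prefetch
override I mentioned above boils down to something like the sketch below.
The two flag names are ones I invented for our fork; only
CacheConfig.PREFETCH_BLOCKS_ON_OPEN_KEY and
HColumnDescriptor#isPrefetchBlocksOnOpen() are stock HBase.

    // In a patched CacheConfig-style decision; 'family' is the
    // HColumnDescriptor of the store being opened.
    boolean forced = conf.getBoolean(
        "hbase.prefetch.blocks.on.open.override", false);  // invented flag
    boolean prefetchOnOpen;
    if (forced) {
      // the job-level value wins, even over the column family descriptor
      prefetchOnOpen = conf.getBoolean(
          "hbase.prefetch.blocks.on.open.value", false);   // invented flag
    } else {
      // stock behavior: site-wide default OR the family's schema setting
      prefetchOnOpen =
          conf.getBoolean(CacheConfig.PREFETCH_BLOCKS_ON_OPEN_KEY, false)
              || family.isPrefetchBlocksOnOpen();
    }

The intent is that with the override flag set in the job's Configuration,
the hfile-prefetcher threads never get started, no matter what the schema
says.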
