Thanks, will do that.

----
Saad
On Mon, Mar 12, 2018 at 12:14 PM, Ted Yu <[email protected]> wrote:

> Saad:
> I encourage you to open an HBase JIRA outlining your use case and the
> config knobs you added through a patch.
>
> We can see the details for each config and make recommendations
> accordingly.
>
> Thanks
>
> On Mon, Mar 12, 2018 at 8:43 AM, Saad Mufti <[email protected]> wrote:
>
> > I have created a company-specific branch and added 4 new flags to
> > control this behavior; these gave us a huge performance boost when
> > running Spark jobs on snapshots of very large tables in S3. I tried to
> > do everything cleanly, but:
> >
> > a) not being familiar with the test strategies, I haven't had time to
> > add any useful tests. Of course I left the default behavior the same,
> > and much of the behavior these flags control only affects performance,
> > not the final result, so I would need some pointers on how to add
> > useful tests.
> >
> > b) I added a new flag to act as an overall override for prefetch
> > behavior, one that overrides any setting even in the column family
> > descriptor; I am not sure what I did was entirely in the spirit of
> > what HBase does.
> >
> > Again, if used properly these flags would only impact jobs using
> > TableSnapshotInputFormat in their Spark or M-R jobs. Would someone
> > from the core team be willing to look at my patch? I have never done
> > this before, so I would appreciate a quick pointer on how to send a
> > patch and get some quick feedback.
> >
> > Cheers.
> >
> > ----
> > Saad
> >
> > On Sat, Mar 10, 2018 at 9:56 PM, Saad Mufti <[email protected]> wrote:
> >
> > > The question remains, though, of why it is even accessing a column
> > > family's files that should be excluded based on the Scan. And that
> > > column family does NOT specify prefetch on open in its schema. Only
> > > the one we want to read specifies prefetch on open, which we want to
> > > override if possible for the Spark job.
> > >
> > > ----
> > > Saad
> > >
> > > On Sat, Mar 10, 2018 at 9:51 PM, Saad Mufti <[email protected]> wrote:
> > >
> > >> See below for more I found on item 3.
> > >>
> > >> Cheers.
> > >>
> > >> ----
> > >> Saad
> > >>
> > >> On Sat, Mar 10, 2018 at 7:17 PM, Saad Mufti <[email protected]> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I am running a Spark job (Spark 2.2.1) on an EMR cluster in AWS.
> > >>> There is no HBase installed on the cluster, only HBase libs linked
> > >>> to my Spark app. We are reading the snapshot info from an HBase
> > >>> folder in S3 using the TableSnapshotInputFormat class from HBase
> > >>> 1.4.0, so that the Spark job reads snapshot info directly from the
> > >>> S3-based filesystem instead of going through any region server.
> > >>>
> > >>> I have observed a few behaviors while debugging performance that
> > >>> are concerning; some we could mitigate, and on others I am looking
> > >>> for clarity:
> > >>>
> > >>> 1) The TableSnapshotInputFormatImpl code tries to get locality
> > >>> information for the region splits. For a snapshot with a large
> > >>> number of files (over 350,000 in our case) this causes a scan of
> > >>> all the file listings in a single thread in the driver. And it was
> > >>> useless, because there is no useful locality information to glean
> > >>> when all the files are in S3 rather than HDFS. So I was forced to
> > >>> make a copy of TableSnapshotInputFormatImpl.java in our code and
> > >>> control this with a config setting I made up. That got rid of the
> > >>> hours-long scan, so I am good with this part for now.
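> > >>>
> > >>> The guard is roughly the following (a simplified sketch from our
> > >>> branch: the config key is a name I invented, and the elided listing
> > >>> code stands in for the stock logic that fills the
> > >>> HDFSBlocksDistribution):
> > >>>
> > >>> boolean localityEnabled = conf.getBoolean(
> > >>>     "hbase.TableSnapshotInputFormat.locality.enabled", true); // invented key
> > >>> List<String> hosts;
> > >>> if (localityEnabled) {
> > >>>   // stock path: walk every store file of the region to build the
> > >>>   // block distribution, then rank hosts by cumulative block weight
> > >>>   HDFSBlocksDistribution dist = new HDFSBlocksDistribution();
> > >>>   // ... existing listing code populates 'dist' from the filesystem ...
> > >>>   hosts = getBestLocations(conf, dist); // existing helper in this class
> > >>> } else {
> > >>>   // S3 reports no meaningful locality, so skip the listing entirely
> > >>>   hosts = java.util.Collections.emptyList();
> > >>> }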
> > >>>
> > >>> 2) I have set a single column family in the Scan that I set on the
> > >>> HBase configuration via:
> > >>>
> > >>> scan.addFamily(str.getBytes())
> > >>>
> > >>> hBaseConf.set(TableInputFormat.SCAN, convertScanToString(scan))
> > >>>
> > >>> But when this code executes under Spark and I observe the threads
> > >>> and logs on the Spark executors, I see it reading S3 files for a
> > >>> column family that was not included in the scan. This column family
> > >>> was intentionally excluded because it is much larger than the
> > >>> others, and we wanted to avoid the cost of reading it.
> > >>>
> > >>> Any advice on what I am doing wrong would be appreciated.
> > >>>
> > >>> 3) We also explicitly set caching of blocks to false on the scan,
> > >>> although I see that in TableSnapshotInputFormatImpl.java it is set
> > >>> to false internally anyway. But when running the Spark job, some
> > >>> executors were taking much longer than others, and when I observed
> > >>> their threads, I saw periodic messages about a few hundred megs of
> > >>> RAM used by the block cache, with the thread sitting there reading
> > >>> data from S3 and occasionally blocked by a couple of other threads
> > >>> with "hfile-prefetcher" in their names. Going back to 2) above,
> > >>> they seem to be reading the wrong column family, but in this item I
> > >>> am more concerned about why they appear to be prefetching and
> > >>> caching blocks at all, when the Scan object is set to not cache
> > >>> blocks.
> > >>>
> > >> I think I figured out item 3: the column family descriptor for the
> > >> table in question has prefetch on open set in its schema. Now, for
> > >> the Spark job, I don't think this serves any useful purpose, does
> > >> it? But I can't see any way to override it. If there is, I'd
> > >> appreciate some advice.
> > >>
> > >> Thanks.
> > >>
> > >>> Thanks in advance for any insights anyone can provide.
> > >>>
> > >>> ----
> > >>> Saad
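
P.S. For anyone who hits the same problem on S3, the overall prefetch
override I mentioned above boils down to something like the sketch below.
The two flag names are ones I invented for our fork; only
CacheConfig.PREFETCH_BLOCKS_ON_OPEN_KEY and
HColumnDescriptor#isPrefetchBlocksOnOpen() are stock HBase.

    // In a patched CacheConfig-style decision; 'family' is the
    // HColumnDescriptor of the store being opened.
    boolean forced = conf.getBoolean(
        "hbase.prefetch.blocks.on.open.override", false);  // invented flag
    boolean prefetchOnOpen;
    if (forced) {
      // the job-level value wins, even over the column family descriptor
      prefetchOnOpen = conf.getBoolean(
          "hbase.prefetch.blocks.on.open.value", false);   // invented flag
    } else {
      // stock behavior: site-wide default OR the family's schema setting
      prefetchOnOpen =
          conf.getBoolean(CacheConfig.PREFETCH_BLOCKS_ON_OPEN_KEY, false)
              || family.isPrefetchBlocksOnOpen();
    }

The intent is that with the override flag set in the job's Configuration,
the hfile-prefetcher threads never get started, no matter what the schema
says.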
