Re: Including asynchbase in the sourcetree

2016-09-28 Thread DO YUNG YOON
Hi, Jong Wook.

1. Good point.

I think  we set column family when create GetRequest(
https://github.com/apache/incubator-s2graph/blob/master/s2core/src/main/scala/org/apache/s2graph/core/storage/hbase/AsynchbaseStorage.scala#L249)
 and Scanner(
https://github.com/apache/incubator-s2graph/blob/master/s2core/src/main/scala/org/apache/s2graph/core/storage/hbase/AsynchbaseStorage.scala#L202
).

So ColumnPaginateFilter itself is not same with storeLimit/Offset
functionality wise, but in our case, it should suffice because we only
scanning one column family at a time.

2. I think your suggestion is much simpler and +1 on avoiding touching
HBaseClient.java.

So as far as I understand, we can do following steps.

1. change custom asynchbase dependency to official asynchbase.
2. make changes on s2graph core to apply Jong Wook's suggestion.

Please give any other thought if you guys have different opinion on this.

@JongWook, Can you create JIRA issue and send PR for this?(unless object).






On Thu, Sep 29, 2016 at 4:18 AM Jong Wook Kim  wrote:

> Hi, thanks for sharing the opinions,
>
> 1.
>
> It seems that GetRequest.storeLimit/Offset applies per column family and
> ColumnPaginationFilter can be applied across multiple column families.
>
> So I’m not sure if they are functionally equivalent, but if that’s what we
> want, I guess ColumnPaginationFilter should suffice.
>
>
> 2.
>
> You added an overloaded scanNextRows(Scanner, int) that differs from
> scanNextRows(Scanner) by only a single line.
>
> Like I did here <
> https://github.com/jongwook/incubator-s2graph/blob/feature/patched-asynchbase/s2core/src/main/scala/org/hbase/async/ScannerExtra.scala#L38-L40>,
> we can avoid touching HBaseClient.java by editing
> Scanner.getNextRowsRequest. So this should be simpler; please let me know
> if I’m missing something.
>
>
> JW
>
>
> > On Sep 28, 2016, at 10:41 AM, DO YUNG YOON  wrote:
> >
> > Hi Jong Wook.
> >
> > Thanks for casting this issue and working on it.
> > I took some time today, to clear the custom patches and find out
> followings.
> >
> > 1. GetRequest.setMaxResultsPerColumnFamily/setRowOffsetPerColumnFamily
> > methods.
> >
> > I think we do not actually need this method which only exist on our
> custom
> > patch.
> > Instead of using custom method on GetRequest, I think we can use
> FilterList.
> > Please check out this commit.
> >
> https://github.com/apache/incubator-s2graph/compare/master...SteamShon:asynchbase_path?expand=1
> >
> >
> > 2. Scanner.setTimeout.
> >
> > Please note that in
> >
> https://github.com/apache/incubator-s2graph/compare/master...SteamShon:asynchbase_path?expand=1
> ,
> > I commented out scanner.setTimeout.
> > Current upstream Asynchbase does not allow user to set timeout on
> scanner.
> > I am going to create issue and send following commits as a PR to
> upstream.
> > Please check out
> >
> https://github.com/OpenTSDB/asynchbase/compare/master...SteamShon:scanner_timeout?expand=1
> >
> > In my humble opinion, while we are waiting on issue on 2 discussed,
> merged,
> > and finally released by asynchbase guys, I think it is reasonable to use
> > your suggested patch for our first release.
> >
> >
> > On Wed, Sep 28, 2016 at 6:25 AM Jong Wook Kim  wrote:
> >
> >> I managed to created a version using ASM and ByteBuddy to do the
> >> patchwork:
> >>
> https://github.com/jongwook/incubator-s2graph/commit/770917e0c05228ba0855c16e417436bcc67ff412
> >> <
> >>
> https://github.com/jongwook/incubator-s2graph/commit/770917e0c05228ba0855c16e417436bcc67ff412
> >.
> >> It does:
> >>
> >> - Use ASM to remove the final modifiers of GetRequest and Scanner, and
> >> make all of their methods public
> >> - Dynamically create a subclass of GetRequest, that supports the
> >> limit/offset by adding the bean properties and intercepting serialize()
> >> method.
> >> - Dynamically create a subclass of Scanner, that adds a bean property
> “int
> >> rpcTimeout” and sets its value in the intercepted getNextRowsRequest().
> >>
> >> Now the modifications are only about 300 lines as opposed to the
> previous
> >> ones thousands.
> >>
> >> JW
> >>
> >>
> >>> On Sep 27, 2016, at 3:07 AM, Jong Wook Kim  wrote:
> >>>
> >>> That would be ideal, but asynchbase is being developed according to
> >> OpenTSDB’s own schedules that we don’t have control over.
> >>>
> >>> They don’t release snapshot artifacts and we needed to make a custom
> >> artifact anyway, and given the fact that their releases are usually only
> >> once a year I’m not sure if such PR can be merged in the foreseeable
> future.
> >>>
> >>> So this patch (as well as the custom artifact and all that jazz) is a
> >> temporary measure until there is the next Asynchbase release that
> contains
> >> the features we need.
> >>>
> >>> I remember that Doyoung made such PR sometime in the past, but I
> >> couldn’t locate it now.
> >>>
> >>>
> >>> JW
> >>>
>  On Sep 27, 2016, at 

Re: Including asynchbase in the sourcetree

2016-09-28 Thread DO YUNG YOON
Hi Jong Wook.

Thanks for casting this issue and working on it.
I took some time today, to clear the custom patches and find out followings.

1. GetRequest.setMaxResultsPerColumnFamily/setRowOffsetPerColumnFamily
methods.

I think we do not actually need this method which only exist on our custom
patch.
Instead of using custom method on GetRequest, I think we can use FilterList.
Please check out this commit.
https://github.com/apache/incubator-s2graph/compare/master...SteamShon:asynchbase_path?expand=1


2. Scanner.setTimeout.

Please note that in
https://github.com/apache/incubator-s2graph/compare/master...SteamShon:asynchbase_path?expand=1,
I commented out scanner.setTimeout.
Current upstream Asynchbase does not allow user to set timeout on scanner.
I am going to create issue and send following commits as a PR to upstream.
Please check out
https://github.com/OpenTSDB/asynchbase/compare/master...SteamShon:scanner_timeout?expand=1

In my humble opinion, while we are waiting on issue on 2 discussed, merged,
and finally released by asynchbase guys, I think it is reasonable to use
your suggested patch for our first release.


On Wed, Sep 28, 2016 at 6:25 AM Jong Wook Kim  wrote:

> I managed to created a version using ASM and ByteBuddy to do the
> patchwork:
> https://github.com/jongwook/incubator-s2graph/commit/770917e0c05228ba0855c16e417436bcc67ff412
> <
> https://github.com/jongwook/incubator-s2graph/commit/770917e0c05228ba0855c16e417436bcc67ff412>.
> It does:
>
> - Use ASM to remove the final modifiers of GetRequest and Scanner, and
> make all of their methods public
> - Dynamically create a subclass of GetRequest, that supports the
> limit/offset by adding the bean properties and intercepting serialize()
> method.
> - Dynamically create a subclass of Scanner, that adds a bean property “int
> rpcTimeout” and sets its value in the intercepted getNextRowsRequest().
>
> Now the modifications are only about 300 lines as opposed to the previous
> ones thousands.
>
> JW
>
>
> > On Sep 27, 2016, at 3:07 AM, Jong Wook Kim  wrote:
> >
> > That would be ideal, but asynchbase is being developed according to
> OpenTSDB’s own schedules that we don’t have control over.
> >
> > They don’t release snapshot artifacts and we needed to make a custom
> artifact anyway, and given the fact that their releases are usually only
> once a year I’m not sure if such PR can be merged in the foreseeable future.
> >
> > So this patch (as well as the custom artifact and all that jazz) is a
> temporary measure until there is the next Asynchbase release that contains
> the features we need.
> >
> > I remember that Doyoung made such PR sometime in the past, but I
> couldn’t locate it now.
> >
> >
> > JW
> >
> >> On Sep 27, 2016, at 2:57 AM, Sergio Fernández 
> wrote:
> >>
> >> As far as I understood, such changes affect mainly configuration
> (timeouts,
> >> limits, offsets, etc), right?
> >> Would not be better to submit a PR to the upstream project to allow to
> >> customize such configurations without modifying the code?
> >>
> >> On Tue, Sep 27, 2016 at 8:49 AM, Jong Wook Kim 
> wrote:
> >>
> >>> Related to the recent comments raised in the vote thread, I’d like to
> >>> revisit the asynchbase issue.
> >>>
> >>> Now their fixes on NSRE are tagged in the recently fixed 1.7.2 on Maven
> >>> Central, the remaining differences between our custom version and the
> >>> official version are:
> >>>
> >>> - RPC-wise timeout setting in Scanner
> >>> - limit and offset setting in GetRequest
> >>>
> >>> I made a small patch, as seen in here  >>> incubator-s2graph/commit/ad5c7f89e46ddbd5dfd9b8721737aa22f94b4002>,
> which
> >>> includes GetRequest.java and Scanner.java in the s2core tree along
> with a
> >>> utility that forces loading the bytecode from s2core’s classpath.
> >>>
> >>> Having two ~1000-line java files which are duplicates might be a bad
> >>> practice, but it eliminates the need to maintain a separate codebase
> and
> >>> maven repository for those small patches.
> >>>
> >>> To be more aesthetically satisfying, I’ve fiddled a little bit with
> >>> bytebuddy  to make runtime modification of
> the
> >>> behaviors of those classes, but the modifications are kindof scattered
> >>> making it harder to write the dynamic proxy.
> >>>
> >>> I’d appreciate any comments on the patch above.
> >>>
> >>> JW
> >>
> >>
> >>
> >>
> >> --
> >> Sergio Fernández
> >> Partner Technology Manager
> >> Redlink GmbH
> >> m: +43 6602747925
> >> e: sergio.fernan...@redlink.co
> >> w: http://redlink.co
> >
>
>


Re: Including asynchbase in the sourcetree

2016-09-27 Thread Jong Wook Kim
I managed to created a version using ASM and ByteBuddy to do the patchwork: 
https://github.com/jongwook/incubator-s2graph/commit/770917e0c05228ba0855c16e417436bcc67ff412
 
.
 It does:

- Use ASM to remove the final modifiers of GetRequest and Scanner, and make all 
of their methods public
- Dynamically create a subclass of GetRequest, that supports the limit/offset 
by adding the bean properties and intercepting serialize() method.
- Dynamically create a subclass of Scanner, that adds a bean property “int 
rpcTimeout” and sets its value in the intercepted getNextRowsRequest().

Now the modifications are only about 300 lines as opposed to the previous ones 
thousands.

JW


> On Sep 27, 2016, at 3:07 AM, Jong Wook Kim  wrote:
> 
> That would be ideal, but asynchbase is being developed according to 
> OpenTSDB’s own schedules that we don’t have control over.
> 
> They don’t release snapshot artifacts and we needed to make a custom artifact 
> anyway, and given the fact that their releases are usually only once a year 
> I’m not sure if such PR can be merged in the foreseeable future.
> 
> So this patch (as well as the custom artifact and all that jazz) is a 
> temporary measure until there is the next Asynchbase release that contains 
> the features we need.
> 
> I remember that Doyoung made such PR sometime in the past, but I couldn’t 
> locate it now.
> 
> 
> JW
> 
>> On Sep 27, 2016, at 2:57 AM, Sergio Fernández  wrote:
>> 
>> As far as I understood, such changes affect mainly configuration (timeouts,
>> limits, offsets, etc), right?
>> Would not be better to submit a PR to the upstream project to allow to
>> customize such configurations without modifying the code?
>> 
>> On Tue, Sep 27, 2016 at 8:49 AM, Jong Wook Kim  wrote:
>> 
>>> Related to the recent comments raised in the vote thread, I’d like to
>>> revisit the asynchbase issue.
>>> 
>>> Now their fixes on NSRE are tagged in the recently fixed 1.7.2 on Maven
>>> Central, the remaining differences between our custom version and the
>>> official version are:
>>> 
>>> - RPC-wise timeout setting in Scanner
>>> - limit and offset setting in GetRequest
>>> 
>>> I made a small patch, as seen in here >> incubator-s2graph/commit/ad5c7f89e46ddbd5dfd9b8721737aa22f94b4002>, which
>>> includes GetRequest.java and Scanner.java in the s2core tree along with a
>>> utility that forces loading the bytecode from s2core’s classpath.
>>> 
>>> Having two ~1000-line java files which are duplicates might be a bad
>>> practice, but it eliminates the need to maintain a separate codebase and
>>> maven repository for those small patches.
>>> 
>>> To be more aesthetically satisfying, I’ve fiddled a little bit with
>>> bytebuddy  to make runtime modification of the
>>> behaviors of those classes, but the modifications are kindof scattered
>>> making it harder to write the dynamic proxy.
>>> 
>>> I’d appreciate any comments on the patch above.
>>> 
>>> JW
>> 
>> 
>> 
>> 
>> -- 
>> Sergio Fernández
>> Partner Technology Manager
>> Redlink GmbH
>> m: +43 6602747925
>> e: sergio.fernan...@redlink.co
>> w: http://redlink.co
> 



Re: Including asynchbase in the sourcetree

2016-09-27 Thread Jong Wook Kim
That would be ideal, but asynchbase is being developed according to OpenTSDB’s 
own schedules that we don’t have control over.

They don’t release snapshot artifacts and we needed to make a custom artifact 
anyway, and given the fact that their releases are usually only once a year I’m 
not sure if such PR can be merged in the foreseeable future.

So this patch (as well as the custom artifact and all that jazz) is a temporary 
measure until there is the next Asynchbase release that contains the features 
we need.

I remember that Doyoung made such PR sometime in the past, but I couldn’t 
locate it now.


JW

> On Sep 27, 2016, at 2:57 AM, Sergio Fernández  wrote:
> 
> As far as I understood, such changes affect mainly configuration (timeouts,
> limits, offsets, etc), right?
> Would not be better to submit a PR to the upstream project to allow to
> customize such configurations without modifying the code?
> 
> On Tue, Sep 27, 2016 at 8:49 AM, Jong Wook Kim  wrote:
> 
>> Related to the recent comments raised in the vote thread, I’d like to
>> revisit the asynchbase issue.
>> 
>> Now their fixes on NSRE are tagged in the recently fixed 1.7.2 on Maven
>> Central, the remaining differences between our custom version and the
>> official version are:
>> 
>> - RPC-wise timeout setting in Scanner
>> - limit and offset setting in GetRequest
>> 
>> I made a small patch, as seen in here > incubator-s2graph/commit/ad5c7f89e46ddbd5dfd9b8721737aa22f94b4002>, which
>> includes GetRequest.java and Scanner.java in the s2core tree along with a
>> utility that forces loading the bytecode from s2core’s classpath.
>> 
>> Having two ~1000-line java files which are duplicates might be a bad
>> practice, but it eliminates the need to maintain a separate codebase and
>> maven repository for those small patches.
>> 
>> To be more aesthetically satisfying, I’ve fiddled a little bit with
>> bytebuddy  to make runtime modification of the
>> behaviors of those classes, but the modifications are kindof scattered
>> making it harder to write the dynamic proxy.
>> 
>> I’d appreciate any comments on the patch above.
>> 
>> JW
> 
> 
> 
> 
> -- 
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 6602747925
> e: sergio.fernan...@redlink.co
> w: http://redlink.co



Re: Including asynchbase in the sourcetree

2016-09-27 Thread Sergio Fernández
As far as I understood, such changes affect mainly configuration (timeouts,
limits, offsets, etc), right?
Would not be better to submit a PR to the upstream project to allow to
customize such configurations without modifying the code?

On Tue, Sep 27, 2016 at 8:49 AM, Jong Wook Kim  wrote:

> Related to the recent comments raised in the vote thread, I’d like to
> revisit the asynchbase issue.
>
> Now their fixes on NSRE are tagged in the recently fixed 1.7.2 on Maven
> Central, the remaining differences between our custom version and the
> official version are:
>
> - RPC-wise timeout setting in Scanner
> - limit and offset setting in GetRequest
>
> I made a small patch, as seen in here  incubator-s2graph/commit/ad5c7f89e46ddbd5dfd9b8721737aa22f94b4002>, which
> includes GetRequest.java and Scanner.java in the s2core tree along with a
> utility that forces loading the bytecode from s2core’s classpath.
>
> Having two ~1000-line java files which are duplicates might be a bad
> practice, but it eliminates the need to maintain a separate codebase and
> maven repository for those small patches.
>
> To be more aesthetically satisfying, I’ve fiddled a little bit with
> bytebuddy  to make runtime modification of the
> behaviors of those classes, but the modifications are kindof scattered
> making it harder to write the dynamic proxy.
>
> I’d appreciate any comments on the patch above.
>
> JW




-- 
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernan...@redlink.co
w: http://redlink.co