I discovered what appears to be a bug while compiling on Windows.
Line 46 of NiFiGroovyTest.groovy should be

    private static final String TEST_RES_PATH =
        Paths.get(NiFiGroovyTest.getClassLoader().getResource(".").toURI()).toString()

instead of

    private static final String TEST_RES_PATH =
        NiFiGroovyTest.getClassLoader().getResource(".").toURI().getPath()

See posts like
https://stackoverflow.com/questions/43972777/exception-in-thread-main-java-nio-file-invalidpathexception-illegal-char
for an explanation.
Thanks
Shawn Weeks
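
The difference between the two expressions above can be seen in a small standalone sketch. It is only an illustration: the class and method names are made up here and are not part of NiFi. On Windows, URI.getPath() returns something like "/C:/work/", which is not a valid native path string, while Paths.get(uri) converts the URI into the platform's own path form.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class, not part of the NiFi code base.
class ResourcePathDemo {

    // Takes the raw path component of the URI. For a Windows URI such as
    // file:///C:/work/ this keeps the leading slash before the drive letter.
    static String viaUriGetPath(URI uri) {
        return uri.getPath();
    }

    // Lets java.nio turn the URI into a platform-native path string.
    static String viaPaths(URI uri) {
        return Paths.get(uri).toString();
    }

    public static void main(String[] args) {
        // Use the current working directory as a stand-in for the test resource directory.
        Path dir = Paths.get(".").toAbsolutePath().normalize();
        URI uri = dir.toUri();
        System.out.println("URI.getPath():  " + viaUriGetPath(uri));
        System.out.println("Paths.get(uri): " + viaPaths(uri));
    }
}
```

On Linux or macOS the two results differ at most by a trailing slash, which is why the problem only shows up when building on Windows.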
-----Original Message-----
From: Shawn Weeks <[email protected]>
Sent: Saturday, May 4, 2019 9:07 AM
To: [email protected]
Subject: RE: Adding HBase Support for AtomicDistributedMapCacheClient
I've created Pull Request https://github.com/apache/nifi/pull/3462 for this
change. I'm still doing some testing and it might not actually work right, but I
wanted some other folks to be able to see it. If anyone knows how to include a
timestamp in a checkAndPut for HBase 1.x, let me know and I'll implement it.
Thanks
Shawn Weeks
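
Why the timestamp matters for checkAndPut (discussed further down the thread) can be shown with an in-memory analogy. This is only a sketch: VersionedCell is a hypothetical stand-in for one cache cell, with a version counter playing the role of the HBase cell timestamp, and it does not use the HBase client at all. A value-only check succeeds even if the value was changed and changed back in between, while a value-plus-version check rejects the update.

```java
// Hypothetical demo class; an in-memory analogy, not HBase code.
class RevisionCheckDemo {

    // Stand-in for a single cache cell. "version" plays the role of the cell timestamp.
    static final class VersionedCell {
        private String value;
        private long version = 1;

        VersionedCell(String initial) {
            this.value = initial;
        }

        synchronized String value() {
            return value;
        }

        synchronized long version() {
            return version;
        }

        // Unconditional write; every write bumps the version.
        synchronized void put(String newValue) {
            value = newValue;
            version++;
        }

        // Value-only check, analogous to the original checkAndPut.
        synchronized boolean checkAndPut(String expected, String newValue) {
            if (!value.equals(expected)) {
                return false;
            }
            put(newValue);
            return true;
        }

        // Value plus version check, analogous to adding a timestamp condition.
        synchronized boolean checkAndPut(String expected, long expectedVersion, String newValue) {
            if (version != expectedVersion) {
                return false;
            }
            return checkAndPut(expected, newValue);
        }
    }

    // Caller A reads "1"; caller B writes "2" and then "1" again; A then tries to replace.
    static boolean replaceAfterChurn(boolean checkVersion) {
        VersionedCell cell = new VersionedCell("1");
        String seenValue = cell.value();
        long seenVersion = cell.version();
        cell.put("2");
        cell.put("1");
        return checkVersion
                ? cell.checkAndPut(seenValue, seenVersion, "2")
                : cell.checkAndPut(seenValue, "2");
    }

    public static void main(String[] args) {
        System.out.println("value only:        " + replaceAfterChurn(false));
        System.out.println("value and version: " + replaceAfterChurn(true));
    }
}
```

The value-only call returns true and the versioned call returns false. Whether the intermediate changes actually matter for the cache is the judgment call debated in the quoted messages.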
-----Original Message-----
From: Bryan Bende <[email protected]>
Sent: Thursday, April 25, 2019 7:05 PM
To: [email protected]
Subject: Re: Adding HBase Support for AtomicDistributedMapCacheClient
Should be available through the existing scan methods; they take a
ResultHandler which gets passed an array of ResultCells, and each one has the
timestamp.
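
The callback pattern described above can be sketched as follows. The types here (Cell, Handler, and the fake scan method) are simplified stand-ins written for this illustration; they are assumptions and not the actual NiFi ResultHandler or ResultCell classes, whose exact signatures should be checked in the NiFi source.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the nested types only mimic the shape of the real API.
class ScanTimestampDemo {

    // Simplified stand-in for one cell returned by a scan.
    static final class Cell {
        final byte[] value;
        final long timestamp;

        Cell(byte[] value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Simplified stand-in for the handler callback: one call per row, with that row's cells.
    interface Handler {
        void handle(byte[] row, Cell[] cells);
    }

    // Handler that remembers the newest cell timestamp it has seen.
    static final class LatestTimestampHandler implements Handler {
        private long latest = -1L;

        @Override
        public void handle(byte[] row, Cell[] cells) {
            for (Cell cell : cells) {
                latest = Math.max(latest, cell.timestamp);
            }
        }

        long latest() {
            return latest;
        }
    }

    // Fake scan that returns a single row with a single cell.
    static void scan(Handler handler) {
        byte[] row = "key-1".getBytes(StandardCharsets.UTF_8);
        byte[] value = "cached-value".getBytes(StandardCharsets.UTF_8);
        handler.handle(row, new Cell[] {new Cell(value, 1556236800000L)});
    }

    public static void main(String[] args) {
        LatestTimestampHandler handler = new LatestTimestampHandler();
        scan(handler);
        System.out.println("cell timestamp: " + handler.latest());
    }
}
```

A fetch that needs the revision could keep the timestamp next to the value in the handler and return both to the caller.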
> On Apr 25, 2019, at 7:52 PM, Shawn Weeks <[email protected]> wrote:
>
> I haven't looked at the other side of the equation yet, which is how to get the
> timestamp on fetch. That will probably require a change or a new scan method.
>
> Thanks
> Shawn
>
> -----Original Message-----
> From: Bryan Bende <[email protected]>
> Sent: Thursday, April 25, 2019 4:29 PM
> To: [email protected]
> Subject: Re: Adding HBase Support for AtomicDistributedMapCacheClient
>
> Also just realized that we do have two versions of the HBase DMC client
> service, so they could each do different things.
>
> The HBase_1_1_2_ClientMapCacheService could call the original checkAndPut,
> and the HBase_2_x_ClientMapCacheService could call the new method.
>
> In this approach the 1_1_2 client service could throw unsupported for the new
> method since it would never be used.
>
> On Thu, Apr 25, 2019 at 5:25 PM Bryan Bende <[email protected]> wrote:
>>
>> Thanks, I'm following now...
>>
>> I think adding the new method to the interface and throwing
>> UnsupportedOperationException for 1_1_2, or using the original
>> checkAndPut and implementing it in both services, would both be fine
>> solutions.
>>
>> I guess another variation might be to introduce the new method in the
>> interface, but in the 1_1_2 implementation just delegate back to the
>> original checkAndPut and ignore the timestamp, and document that it
>> isn't used in that implementation. I don't love this, but it does
>> allow both services to implement the functionality and still leverage
>> the better solution for 2_x.
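
The variations described above can be compared in a small sketch. Everything in it is hypothetical: CacheClient is a trimmed-down stand-in rather than the real HBaseClientService interface, and a ConcurrentHashMap replaces HBase. It only illustrates the two behaviours a 1.x-style service could have for the new timestamp-aware method: reject it, or delegate to the original check and ignore the timestamp.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; none of these types exist in NiFi.
class ClientServiceVariants {

    interface CacheClient {
        // Original value-only check.
        boolean checkAndPut(String row, String expected, String newValue);

        // New timestamp-aware check. The default body is the "throw unsupported" option.
        default boolean checkAndPut(String row, String expected, long timestamp, String newValue) {
            throw new UnsupportedOperationException("timestamp check not supported");
        }
    }

    // Option A: a 1.x-style service that inherits the default and rejects the new method.
    static class StrictV1Client implements CacheClient {
        private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

        void seed(String row, String value) {
            store.put(row, value);
        }

        @Override
        public boolean checkAndPut(String row, String expected, String newValue) {
            // Atomic replace-if-equal on the in-memory map.
            return store.replace(row, expected, newValue);
        }
    }

    // Option B: a 1.x-style service that delegates and ignores the timestamp.
    static class DelegatingV1Client extends StrictV1Client {
        @Override
        public boolean checkAndPut(String row, String expected, long timestamp, String newValue) {
            return checkAndPut(row, expected, newValue);
        }
    }

    public static void main(String[] args) {
        DelegatingV1Client client = new DelegatingV1Client();
        client.seed("key-1", "1");
        System.out.println("delegating: " + client.checkAndPut("key-1", "1", 42L, "2"));
    }
}
```

Option B keeps both services usable by the same caller, at the cost of a weaker check that has to be documented, which is the trade-off noted above.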
>>
>>
>> On Thu, Apr 25, 2019 at 3:54 PM Shawn Weeks <[email protected]>
>> wrote:
>>>
>>> Here is what I think the new checkAndPut or checkAndMutate method would
>>> look like. This also shows what the new mutate api looks like.
>>>
>>> @Override
>>> public boolean checkAndPut(String tableName, byte[] rowId, byte[] family,
>>>         byte[] qualifier, byte[] value, long timestamp, PutColumn column)
>>>         throws IOException {
>>>     try (final Table table = connection.getTable(TableName.valueOf(tableName))) {
>>>         Put put = new Put(rowId);
>>>         put.addColumn(
>>>                 column.getColumnFamily(),
>>>                 column.getColumnQualifier(),
>>>                 column.getBuffer());
>>>         return table.checkAndMutate(rowId, family)
>>>                 .qualifier(qualifier)
>>>                 .ifEquals(value)
>>>                 .timeRange(TimeRange.at(timestamp))
>>>                 .thenPut(put);
>>>     }
>>> }
>>>
>>> If the atomic guarantee for the original checkAndPut is good enough then
>>> there is no reason I can't implement the atomic map cache for both versions
>>> of HBase.
>>>
>>> Thanks
>>> Shawn
>>>
>>> -----Original Message-----
>>> From: Bryan Bende <[email protected]>
>>> Sent: Thursday, April 25, 2019 12:39 PM
>>> To: [email protected]
>>> Subject: Re: Adding HBase Support for
>>> AtomicDistributedMapCacheClient
>>>
>>> I'm not totally sure it would matter if there were changes in between, as long
>>> as the current value is what we thought it was then the changes we are
>>> sending back should be accurate as a replacement. As a simplified scenario,
>>> if the current value is 1 and thread-A retrieves that value, thread-B then
>>> changes it to 2 and back to 1 before thread-A can do anything, then
>>> thread-A sends in 2 with a previous of 1, that is still the correct
>>> replacement.
>>>
>>> I can see the argument for using the timestamp though... can you show the
>>> method signature of the new checkAndMutate method that would need to be
>>> added to the client service, and also which method of the HBase client it
>>> needs to call?
>>>
>>> Just so I can get an idea of the differences between 1.x and 2.x.
>>>
>>> On Thu, Apr 25, 2019 at 1:00 PM Shawn Weeks <[email protected]>
>>> wrote:
>>>>
>>>> While checkAndPut is atomic as it's built now, it doesn't support also
>>>> checking the timestamp range which is included in the new checkAndMutate
>>>> API. I had planned on using the cell's timestamp as the revision along
>>>> with the value to ensure not only that the value hadn't been changed but
>>>> that there hadn't been changes in between that just happened to put the
>>>> value back.
>>>>
>>>> As I was looking at everything I had another question: why is the cache
>>>> currently using a scan instead of a get to fetch values from HBase? It
>>>> seems like that would be much less performant considering we know the row
>>>> key we're looking for.
>>>>
>>>>
>>>> Thanks
>>>> Shawn
>>>>
>>>> -----Original Message-----
>>>> From: Bryan Bende <[email protected]>
>>>> Sent: Thursday, April 25, 2019 11:56 AM
>>>> To: [email protected]
>>>> Subject: Re: Adding HBase Support for
>>>> AtomicDistributedMapCacheClient
>>>>
>>>> Can it not be done with the existing checkAndPut method? [1]
>>>>
>>>> I think if you use the value as the revision it should work. Would be
>>>> similar to how the Redis implementation works [2].
>>>>
>>>> [1]
>>>> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-hbase-client-service-api/src/main/java/org/apache/nifi/hbase/HBaseClientService.java#L65
>>>> [2]
>>>> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-redis-bundle/nifi-redis-extensions/src/main/java/org/apache/nifi/redis/service/RedisDistributedMapCacheClientService.java#L271
>>>>
>>>> On Thu, Apr 25, 2019 at 12:38 PM Shawn Weeks <[email protected]>
>>>> wrote:
>>>>>
>>>>> I'll need to add a checkAndMutate method to the HBaseClientService
>>>>> interface. Should I extend it with an HBase2ClientService, or add
>>>>> checkAndMutate to the existing interface and make it raise an
>>>>> exception if you try to use it against HBase 1? While HBase 1.x supports
>>>>> checkAndMutate, it doesn't provide a way to filter on timestamp, which is
>>>>> part of how I was going to implement the revision requirement for
>>>>> AtomicMapCache.
>>>>>
>>>>> Thanks
>>>>> Shawn
>>>>>
>>>>> -----Original Message-----
>>>>> From: Bryan Bende <[email protected]>
>>>>> Sent: Thursday, April 25, 2019 9:11 AM
>>>>> To: [email protected]
>>>>> Subject: Re: Adding HBase Support for
>>>>> AtomicDistributedMapCacheClient
>>>>>
>>>>> I'm not aware of a JIRA, so I'd say go for it.
>>>>>
>>>>> On Wed, Apr 24, 2019 at 9:27 PM Shawn Weeks <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> Seems like this should be fairly easy for HBase 2.x with the
>>>>>> checkAndMutate functionality and I was wondering if there is already a
>>>>>> Jira for this. Otherwise I might make an attempt at it. It would be good
>>>>>> to be able to support Wait/Notify and other things that need
>>>>>> AtomicDistributedMapCacheClient using an Apache developed product
>>>>>> commonly found in a Hadoop Cluster.
>>>>>>
>>>>>> Thanks
>>>>>> Shawn