[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2017-04-05 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-6618:
---
Fix Version/s: (was: 1.4.0)
   1.5.0

> Implement FuzzyRowFilter with ranges support
> 
>
> Key: HBASE-6618
> URL: https://issues.apache.org/jira/browse/HBASE-6618
> Project: HBase
>  Issue Type: New Feature
>  Components: Filters
>Reporter: Alex Baranau
>Assignee: Alex Baranau
>Priority: Minor
> Fix For: 2.0.0, 1.5.0
>
> Attachments: HBASE-6618_2.path, HBASE-6618_3.path, 
> HBASE-6618_4.patch, HBASE-6618_5.patch, HBASE-6618-algo-desc-bits.png, 
> HBASE-6618-algo.patch, HBASE-6618.patch
>
>
> Apart from current ability to specify fuzzy row filter e.g. for 
>  format as _0004 (where 0004 - actionId) it would be 
> great to also have ability to specify the "fuzzy range" , e.g. _0004, 
> ..., _0099.
> See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
> Note: currently it is possible to provide multiple fuzzy row rules to 
> existing FuzzyRowFilter, but in case when the range is big (contains 
> thousands of values) it is not efficient.
> Filter should perform efficient fast-forwarding during the scan (this is what 
> distinguishes it from regex row filter).
> While such functionality may seem like a proper fit for custom filter (i.e. 
> not including into standard filter set) it looks like the filter may be very 
> re-useable. We may judge based on the implementation that will hopefully be 
> added.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2016-06-17 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-6618:
---
Fix Version/s: (was: 1.3.0)
   1.4.0

> Implement FuzzyRowFilter with ranges support
> 
>
> Key: HBASE-6618
> URL: https://issues.apache.org/jira/browse/HBASE-6618
> Project: HBase
>  Issue Type: New Feature
>  Components: Filters
>Reporter: Alex Baranau
>Assignee: Alex Baranau
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
> HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch, 
> HBASE-6618_5.patch
>
>
> Apart from current ability to specify fuzzy row filter e.g. for 
>  format as _0004 (where 0004 - actionId) it would be 
> great to also have ability to specify the "fuzzy range" , e.g. _0004, 
> ..., _0099.
> See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
> Note: currently it is possible to provide multiple fuzzy row rules to 
> existing FuzzyRowFilter, but in case when the range is big (contains 
> thousands of values) it is not efficient.
> Filter should perform efficient fast-forwarding during the scan (this is what 
> distinguishes it from regex row filter).
> While such functionality may seem like a proper fit for custom filter (i.e. 
> not including into standard filter set) it looks like the filter may be very 
> re-useable. We may judge based on the implementation that will hopefully be 
> added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2015-06-22 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-6618:
---
Fix Version/s: (was: 1.2.0)
   1.3.0
   Status: Open  (was: Patch Available)

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 2.0.0, 1.3.0

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch, 
 HBASE-6618_5.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2015-04-27 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-6618:

Fix Version/s: (was: 1.1.0)
   1.2.0
   2.0.0

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch, 
 HBASE-6618_5.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2014-12-17 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6618:
-
Fix Version/s: (was: 1.0.0)
   1.1.0

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 1.1.0

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch, 
 HBASE-6618_5.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2014-12-02 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6618:
-
Fix Version/s: (was: 0.99.2)
   1.0.0

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 1.0.0

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch, 
 HBASE-6618_5.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2014-10-11 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6618:
-
Fix Version/s: (was: 0.99.1)
   0.99.2

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 0.99.2

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch, 
 HBASE-6618_5.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2014-09-09 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6618:
-
Fix Version/s: (was: 0.99.0)
   0.99.1

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 0.99.1

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch, 
 HBASE-6618_5.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2014-04-05 Thread Alex Baranau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Baranau updated HBASE-6618:


Attachment: HBASE-6618_5.patch

addressed review comments

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 0.99.0

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch, 
 HBASE-6618_5.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2014-04-04 Thread Alex Baranau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Baranau updated HBASE-6618:


Attachment: HBASE-6618_4.patch

Updated patch to fit latest trunk

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 0.99.0

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2014-04-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6618:
--

Status: Patch Available  (was: Open)

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 0.99.0

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path, HBASE-6618_4.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2013-12-24 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6618:
--

Fix Version/s: (was: 0.98.0)
   0.99.0
   Status: Open  (was: Patch Available)

Stale issue

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 0.99.0

 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch, HBASE-6618_2.path, HBASE-6618_3.path


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2013-07-02 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6618:
--

Fix Version/s: 0.98.0

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Fix For: 0.98.0

 Attachments: HBASE-6618_2.path, HBASE-6618_3.path, 
 HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, HBASE-6618.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2013-01-03 Thread Alex Baranau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Baranau updated HBASE-6618:


Attachment: HBASE-6618_3.path

Fixed line length in patch (one was 100).

Not sure what about those tests: they work well for me locally.

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Attachments: HBASE-6618_2.path, HBASE-6618_3.path, 
 HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, HBASE-6618.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2012-12-31 Thread Alex Baranau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Baranau updated HBASE-6618:


Attachment: HBASE-6618.patch

Looks like Anil didn't find time for that in the end.

But I believe that this functionality is very very useful. Will work on it then.

Attached patch with implemented filter functionality. I have doubts about 
filter API (i.e. defining fuzzy rules with Triplebyte[], byte[], byte[]), any 
suggestions are very welcome!

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Priority: Minor
 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2012-12-31 Thread Alex Baranau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Baranau updated HBASE-6618:


Assignee: Alex Baranau

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: Filters
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch, 
 HBASE-6618.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6618) Implement FuzzyRowFilter with ranges support

2012-08-22 Thread Alex Baranau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Baranau updated HBASE-6618:


Attachment: HBASE-6618-algo-desc-bits.png
HBASE-6618-algo.patch

Anil,

Now that I thought about it I just realized that finding the row key to 
fast-forward to, when given any number of range groups in the fuzzy rules is 
quite easy. And also can be done in just *one pass*, by going through the bytes 
of the given row.

Didn't have much time to add this functionality to the filter itself, but 
implemented the algorithm that seems to find the row key to fast-forward to (if 
you are interested to look at it). Added static method for that with small (not 
full) unit-test. Also attached brief description of the algo. I hope I'm not 
missing anything.

Will implement the new feature of the filter as a next step.

 Implement FuzzyRowFilter with ranges support
 

 Key: HBASE-6618
 URL: https://issues.apache.org/jira/browse/HBASE-6618
 Project: HBase
  Issue Type: New Feature
  Components: filters
Reporter: Alex Baranau
Priority: Minor
 Attachments: HBASE-6618-algo-desc-bits.png, HBASE-6618-algo.patch


 Apart from current ability to specify fuzzy row filter e.g. for 
 userId_actionId format as _0004 (where 0004 - actionId) it would be 
 great to also have ability to specify the fuzzy range , e.g. _0004, 
 ..., _0099.
 See initial discussion here: http://search-hadoop.com/m/WVLJdX0Z65
 Note: currently it is possible to provide multiple fuzzy row rules to 
 existing FuzzyRowFilter, but in case when the range is big (contains 
 thousands of values) it is not efficient.
 Filter should perform efficient fast-forwarding during the scan (this is what 
 distinguishes it from regex row filter).
 While such functionality may seem like a proper fit for custom filter (i.e. 
 not including into standard filter set) it looks like the filter may be very 
 re-useable. We may judge based on the implementation that will hopefully be 
 added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira