[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-04-08 Thread Sergey Shelukhin (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625623#comment-13625623
]

Sergey Shelukhin commented on HBASE-5416:
-

What is this patch for in this JIRA? It was closed months ago.

Improve performance of scans with some kind of filters.
---

Key: HBASE-5416
URL: https://issues.apache.org/jira/browse/HBASE-5416
Project: HBase
Issue Type: Improvement
Components: Filters, Performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Sergey Shelukhin
Fix For: 0.94.5, 0.95.0

Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 5416-0.94-v3.txt,
5416-drop-new-method-from-filter.txt, 5416-Filtered_scans_v6.patch,
5416-TestJoinedScanners-0.94.txt, 5416-v13.patch, 5416-v14.patch,
5416-v15.patch, 5416-v16.patch, 5416-v5.txt, 5416-v6.txt,
Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch,
Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch,
Filtered_scans_v7.patch, HBASE-5416-v10.patch, HBASE-5416-v11.patch,
HBASE-5416-v12.patch, HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch,
HBASE-5416-v8.patch, HBASE-5416-v9.patch,
org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt

When a scan is performed, the whole row is loaded into the result list, and
only then is the filter (if any) applied to decide whether the row is needed.
But when a scan covers several CFs and the filter checks data from only a
subset of them, data from the CFs the filter does not check is not needed at
the filter stage; it is needed only once we have decided to include the current
row. In that case we can significantly reduce the amount of IO a scan performs
by loading only the values the filter actually checks.
For example, we have two CFs: flags and snap. Flags is quite small (a bunch of
megabytes) and is used to filter large entries from snap. Snap is very large
(10s of GB) and is quite costly to scan. If we need only rows with some flag
set, we use SingleColumnValueFilter to limit the result to a small subset of
the region. But the current implementation loads both CFs to perform the scan,
when only a small subset is needed.
The attached patch adds one routine to the Filter interface that lets a filter
specify which CFs it needs for its operation. In HRegion, we separate all
scanners into two groups: those needed for the filter and the rest (joined).
When a new row is considered, only the needed data is loaded and the filter is
applied; only if the filter accepts the row is the rest of the data loaded. On
our data, this speeds up such scans 30-50 times. It also gives us a way to
better normalize the data into separate columns by optimizing the scans
performed.
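
The two-phase read described above can be illustrated with a small, self-contained Java sketch. It is a toy model rather than HBase code: RowFilter, scan(), familyReads and the sample rows are hypothetical names made up for the illustration, and each family holds a single value per row for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the two-phase read described above; none of this is HBase code. */
final class LazyCfScanSketch {

    /** Hypothetical stand-in for the routine the patch adds to Filter. */
    interface RowFilter {
        boolean isFamilyEssential(String family);
        boolean accepts(Map<String, String> essentialCells);
    }

    /** Counts per-family reads as a crude proxy for IO. */
    static int familyReads = 0;

    /** Builds a row with one value in each of the two example families. */
    static Map<String, String> row(String flags, String snap) {
        Map<String, String> families = new LinkedHashMap<>();
        families.put("flags", flags);
        families.put("snap", snap);
        return families;
    }

    /** Returns the indexes of accepted rows, reading non-essential families lazily. */
    static List<Integer> scan(List<Map<String, String>> rows, RowFilter filter) {
        List<Integer> accepted = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            Map<String, String> row = rows.get(i);
            // Phase 1: read only the families the filter declares essential.
            Map<String, String> essential = new LinkedHashMap<>();
            for (Map.Entry<String, String> cell : row.entrySet()) {
                if (filter.isFamilyEssential(cell.getKey())) {
                    essential.put(cell.getKey(), cell.getValue());
                    familyReads++;
                }
            }
            if (!filter.accepts(essential)) {
                continue; // rejected: the remaining families are never read
            }
            // Phase 2: the row passed, so read the remaining ("joined") families.
            for (String family : row.keySet()) {
                if (!filter.isFamilyEssential(family)) {
                    familyReads++;
                }
            }
            accepted.add(i);
        }
        return accepted;
    }

    /** Filter that needs only "flags" and accepts rows whose flag is "on". */
    static RowFilter flagIsOn() {
        return new RowFilter() {
            @Override
            public boolean isFamilyEssential(String family) {
                return "flags".equals(family);
            }

            @Override
            public boolean accepts(Map<String, String> essentialCells) {
                return "on".equals(essentialCells.get("flags"));
            }
        };
    }

    /** Three sample rows; only the middle one has its flag set. */
    static List<Map<String, String>> sampleRows() {
        return Arrays.asList(row("off", "big-0"), row("on", "big-1"), row("off", "big-2"));
    }

    public static void main(String[] args) {
        familyReads = 0;
        List<Integer> accepted = scan(sampleRows(), flagIsOn());
        System.out.println("accepted rows: " + accepted + ", family reads: " + familyReads);
    }
}
```

With the three sample rows, reading both families for every row would cost six family reads. The sketch performs four: one flags read per row, plus one snap read for the single accepted row.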

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585131#comment-13585131
]

Ted Yu commented on HBASE-5416:
---

I am with Lars on this one. The feature should be part of 0.94.

bq. Ted, I think your approach will just make things more complicated going
forward
Another option is to drop the new method from the Filter interface. The
server-side implementation depends on FilterBase, which has the stub
isFamilyEssential(). HRegion.RegionScannerImpl can use an instanceof check,
which is fast.
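
A minimal sketch of that option, assuming heavily reduced stand-ins for the real classes: Filter here has a single method, and the stub's default of returning true (every family is essential) is an assumption of the sketch. DirectFilter and FlagsOnlyFilter are hypothetical examples.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical, heavily reduced types; the real HBase classes have many more methods. */
final class InstanceofSketch {

    /** Filter without the new method, so existing implementers keep working. */
    interface Filter {
        boolean filterRow();
    }

    /** The stub lives only here. Returning true for every family is an assumption. */
    abstract static class FilterBase implements Filter {
        public boolean isFamilyEssential(byte[] name) {
            return true;
        }
    }

    /** A filter written directly against the interface. */
    static final class DirectFilter implements Filter {
        @Override
        public boolean filterRow() {
            return false;
        }
    }

    /** A FilterBase subclass that needs only the "flags" family. */
    static final class FlagsOnlyFilter extends FilterBase {
        private static final byte[] FLAGS = "flags".getBytes(StandardCharsets.UTF_8);

        @Override
        public boolean filterRow() {
            return false;
        }

        @Override
        public boolean isFamilyEssential(byte[] name) {
            return Arrays.equals(FLAGS, name);
        }
    }

    /** What the scanner could do: ask FilterBase subclasses, load everything otherwise. */
    static boolean isFamilyEssential(Filter filter, byte[] family) {
        if (filter instanceof FilterBase) {
            return ((FilterBase) filter).isFamilyEssential(family);
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] snap = "snap".getBytes(StandardCharsets.UTF_8);
        System.out.println("direct filter, snap essential: "
                + isFamilyEssential(new DirectFilter(), snap));
        System.out.println("flags-only filter, snap essential: "
                + isFamilyEssential(new FlagsOnlyFilter(), snap));
    }
}
```

Filters that bypass FilterBase simply lose the optimization and see every family loaded, which matches the behavior before the patch.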

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Dave Latham (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585148#comment-13585148
]

Dave Latham commented on HBASE-5416:

{quote}
How hard is it to change filter to use FilterBase and replace it first?
{quote}
The change is very simple for us. It means we need to wait a bit before
deploying the hbase upgrade until we can upgrade our client apps first, though.
This is what we've decided to do, so this incompatibility is not going to be a
blocker for us, just a slight delay.

{quote}
I'd be interested in why you had to implement Filter directly rather than
extending FilterBase.
{quote}
This particular Filter implementation was made as a wrapper around any other
Filter as part of some experiments we were doing for more dynamic Filter
classloading a couple years back. I don't think there was a FilterBase class
at the time or we may have just chosen to make it a generic Filter (or actually
RowFilterInterface back then) to make sure it implements and wraps every method.

I think leaving the method in FilterBase only for 0.94 would be a good move.
However, it's a bit tricky since 0.94.5 has already been released. If the
method is dropped from Filter in 0.94.6 then we're saying 0.94.6 is compatible
with everything but 0.94.5. However if you were unfortunate enough to start on
0.94.5 and implement Filter directly then you're going to break again. Perhaps
that's a rare enough case.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585151#comment-13585151
]

Ted Yu commented on HBASE-5416:
---

@Dave:
0.94.5 was announced on 2013-02-16. If 0.94.6 is released within 10 days, the
window for someone to implement the 0.94.5 version of the Filter interface is
very short.

In hindsight, we should have implemented this feature in 0.94 without touching
the Filter interface.
We have learned a good lesson (for other interfaces).

If you want to deploy 0.94.5 in the next few days, try omitting @Override in
your Filter implementation.
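
To illustrate why leaving out @Override helps, here is a small sketch. It assumes a hypothetical, reduced Filter interface standing in for the older version that lacks isFamilyEssential; WrapperFilter is likewise a made-up name.

```java
/** Sketch only: Filter below is a reduced stand-in for the older interface. */
final class NoOverrideSketch {

    /** Older shape of the interface: isFamilyEssential is not declared. */
    interface Filter {
        boolean filterRow();
    }

    /** A filter that implements the interface directly. */
    static final class WrapperFilter implements Filter {
        @Override
        public boolean filterRow() {
            return false;
        }

        // Deliberately no @Override. Against the older interface this is just an
        // extra public method. Against an interface that declares
        // isFamilyEssential(byte[]), the same source satisfies the new method.
        // With @Override, compiling against the older interface would fail.
        public boolean isFamilyEssential(byte[] name) {
            return true; // assumption: treat every family as essential
        }
    }

    public static void main(String[] args) {
        WrapperFilter filter = new WrapperFilter();
        System.out.println(filter.isFamilyEssential(new byte[0]));
    }
}
```

The same class therefore builds against either version of the interface, so the client can be updated ahead of the region servers.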

Again, thanks for reporting this - other HBase users will benefit.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585167#comment-13585167
]

Ted Yu commented on HBASE-5416:
---

testScanner_JoinedScanners passed as well:
{code}
Running org.apache.hadoop.hbase.regionserver.TestHRegion
2013-02-23 09:07:06.614 java[57714:1203] Unable to load realm info from SCDynamicStore
Tests run: 73, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.586 sec
{code}

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585211#comment-13585211
]

Lars Hofhansl commented on HBASE-5416:
--

Moving the method to FilterBase is a great idea and a good compromise.

Personally I think implementing Filters directly is rare and I would have a
preference for keeping this in the interface since it is cleaner.
isFamilyEssential is very useful for future scan performance enhancements, I
would hate to see it vanish again from the Filter interface. (Incidentally in
trunk Filter is now a class, which would have allowed us to make changes
without this problem).

As an alternative, can we add a note to the Javadoc of Filter advising users
not to implement it directly and to extend FilterBase instead?

[~davelatham] If this is a hassle for you I think we're all in agreement that
we should push the method down to FilterBase.
[~stack] I think you'd prefer the push into FilterBase. Let's just do that.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Dave Latham (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585230#comment-13585230
]

Dave Latham commented on HBASE-5416:

It's not going to make a difference for me any longer as we're planning to move
forward with an application update then a 0.94.5 upgrade. However, it sounds
like a good plan to move to FilterBase (in the 0.94 branch only) to preserve
compatibility for anyone else who comes along.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585233#comment-13585233
]

stack commented on HBASE-5416:
--

bq. It's not going to make a difference for me any longer as we're planning to
move forward with an application update then a 0.94.5 upgrade.

So, you don't need us to back anything out?

bq. What do you think of my proposal above (@23/Feb/13 06:11) ?

Can't find what you are referring to [~ted_yu]. If I search I only see the
above pointer.

bq. Dave Latham If this is a hassle for you I think we're all in agreement that
we should push the method down to FilterBase.

For 0.94? If it means a better compatibility story (with a hiccup, i.e. we
warn folks about the prob. in 0.94.5 but it's fixed in 0.94.6), then I'm for it.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585237#comment-13585237
]

Ted Yu commented on HBASE-5416:
---

bq. If i search I only see the above pointer.
I was referring to approach #1 which Lars said is too complicated.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585242#comment-13585242
]

Ted Yu commented on HBASE-5416:
---

Created HBASE-7920 to move the new method out of Filter interface.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Dave Latham (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585246#comment-13585246
]

Dave Latham commented on HBASE-5416:

{quote}So, you don't need us to back anything out?{quote}
That's right, we're just going to work around it.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-23 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585281#comment-13585281
]

Lars Hofhansl commented on HBASE-5416:
--

It feels a bit like an overreaction. Not many folks implement their own
filters, of those not many implement Filter directly, there are workarounds,
and for Dave it no longer matters.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread Dave Latham (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584553#comment-13584553
]

Dave Latham commented on HBASE-5416:

I have a class that directly implements the Filter interface. This change
looks to me like it will prevent me from doing a rolling upgrade to 0.94.5 of
region servers while my client is using this filter on scans because the filter
will fail to implement the changed interface. Is that correct? Is that
acceptable?

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread Sergey Shelukhin (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584971#comment-13584971
]

Sergey Shelukhin commented on HBASE-5416:
-

Hmm, this is correct. I am not sure if this is acceptable, iirc I saw someone
pondering that (on the mailing list?) but deciding that most people would use
FilterBase, but I cannot find it now.
How hard is it to change filter to use FilterBase and replace it first?

[~lhofhansl] Do you have an opinion?

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585018#comment-13585018
]

stack commented on HBASE-5416:
--

[~lhofhansl] Would suggest backing out this change if it breaks compatibility,
especially if it breaks compatibility for our homies in SOMA, SF.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585021#comment-13585021
]

stack commented on HBASE-5416:
--

[~lhofhansl] Thanks for adding it to trunk. [~shmuma] Any chance of a
paragraph on your fancy new feature? If you draft it -- including the
possibilities your patch enables -- I'll take care of getting it into the ref
guide. Good stuff.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585041#comment-13585041
]

Ted Yu commented on HBASE-5416:
---

The 0.94 patch did introduce a subtle issue.

But this feature is useful. See the email thread entitled 'Co-Processor in
scanning the HBase's Table' on the mailing list.

The cause seems to be the addition of a new method to the Filter interface.
Can we do the following?
1. introduce new interface, say Filter2 (open to other names), where
isFamilyEssential(byte[] name) is added
2. move isFamilyEssential(byte[] name) out of Filter interface
3. let FilterBase implement Filter2
4. declare filter field of RegionScannerImpl to be of type Filter2
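
The four steps above can be sketched in Java roughly as follows. The types are reduced stand-ins, not the real HBase classes. Having Filter2 extend Filter, the stub's default of true, and the asFilter2 adapter for filters that implement only Filter are all assumptions of this sketch; the proposal does not spell them out.

```java
/** Sketch of the Filter2 proposal with reduced, hypothetical types. */
final class Filter2Sketch {

    /** Step 2: Filter no longer declares isFamilyEssential (one method shown). */
    interface Filter {
        boolean filterRow();
    }

    /** Step 1: the new interface. Extending Filter is an assumption of this sketch. */
    interface Filter2 extends Filter {
        boolean isFamilyEssential(byte[] name);
    }

    /** Step 3: FilterBase implements Filter2; the default of true is an assumption. */
    abstract static class FilterBase implements Filter2 {
        @Override
        public boolean isFamilyEssential(byte[] name) {
            return true;
        }
    }

    /** Step 4: the scanner's filter field is declared as Filter2. */
    static final class RegionScannerImpl {
        final Filter2 filter;

        RegionScannerImpl(Filter filter) {
            this.filter = asFilter2(filter);
        }
    }

    /** Hypothetical adapter (not part of the proposal) for Filter-only implementations. */
    static Filter2 asFilter2(final Filter filter) {
        if (filter instanceof Filter2) {
            return (Filter2) filter;
        }
        return new Filter2() {
            @Override
            public boolean filterRow() {
                return filter.filterRow();
            }

            @Override
            public boolean isFamilyEssential(byte[] name) {
                return true; // no hint available, so load every family
            }
        };
    }

    public static void main(String[] args) {
        Filter direct = () -> false; // implements only the original interface
        RegionScannerImpl scanner = new RegionScannerImpl(direct);
        System.out.println(scanner.filter.isFamilyEssential(new byte[0]));
    }
}
```
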

Since 0.94.5 has been rolled out, it is another kind of regression if this
feature is taken out.

My two cents.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585048#comment-13585048
]

Lars Hofhansl commented on HBASE-5416:
--

IMHO this is similar to the coprocessor changes we had made in some 0.94 point
releases that also break coprocessors (unless they derive from classes like
BaseRegionObserver). In fact our own Phoenix folks ran into issues with this.

These are somewhat internal APIs and we should be able to change them...
Although I admit Filters are more stable in terms of APIs than coprocessors.
Still, I'd vote for keeping this patch, unchanged.

[~davelatham], I'd be interested in why you had to implement Filter directly
rather than extending FilterBase.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585055#comment-13585055
]

stack commented on HBASE-5416:
--

There is no argument that this is a 'useful' feature. 'Useful' is not a good
enough reason to break a 'public' Interface. Why would we put any obstacle in
the way of the group that is running the largest hbase deploy? Don't they have
enough headaches already w/o having to jump a gratuitous incompatibility
hurdle? Does anyone even 'need' this feature in 0.94? Suggest removing it for
0.94.6 so our man Dave can just go there.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585058#comment-13585058
]

Ted Yu commented on HBASE-5416:
---

[~davelatham], [~stack]:
What do you think of my proposal above (@23/Feb/13 06:11) ?

Would that allow Dave to get over the hurdle ?

If you think so, I can open a new JIRA with a patch.

Thanks

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585062#comment-13585062
]

Lars Hofhansl commented on HBASE-5416:
--

A rolling upgrade is still possible if a stub isFamilyEssential(...) is added
to the Filter implementation before the rolling upgrade.

Anyway, I am not attached to this feature in 0.94.

At the same time I do not want to cripple our ability to make some changes to
these APIs. We have not frozen the coprocessor APIs and neither should we
freeze the Filter APIs.
What if somebody had implemented a coprocessor API that we had changed? In the
past we have stated that we will change these (coprocessor) APIs.

Ted, I think your approach will just make things more complicated going
forward. And I'd prefer to either keep this or revert altogether.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-22 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585064#comment-13585064
]

Lars Hofhansl commented on HBASE-5416:
--

One last comment :)

The reason I am arguing for keeping this is that it is one of the few
features that allow HBase to make use of its columnar nature to speed up
queries.
HBase is not known for its scan performance, and this is one feature to point
to where we allow HBase to not even look at another column family unless a
filter is matched, for potentially significant speedups. I was planning on
extending this to other filters as well.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-02-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570945#comment-13570945
 ] 

Hudson commented on HBASE-5416:
---

Integrated in HBase-0.94-security-on-Hadoop-23 #11 (See 
[https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/11/])
HBASE-5416 Improve performance of scans with some kind of filters. (Sergey 
Shelukhin) (Revision 1433195)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/Filter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueExcludeFilter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/filter/TestSingleColumnValueExcludeFilter.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Improve performance of scans with some kind of filters.
 ---

 Key: HBASE-5416
 URL: https://issues.apache.org/jira/browse/HBASE-5416
 Project: HBase
  Issue Type: Improvement
  Components: Filters, Performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Sergey Shelukhin
 Fix For: 0.96.0, 0.94.5


[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13553712#comment-13553712
 ] 

Hudson commented on HBASE-5416:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #348 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/348/])
HBASE-7383 create integration test for HBASE-5416 (improving scan 
performance for certain filters) (Sergey) (Revision 1433224)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/test/LoadTestDataGenerator.java
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/test/LoadTestKVGenerator.java
* /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
* /hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestLazyCfLoading.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedAction.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554223#comment-13554223
 ] 

Hudson commented on HBASE-5416:
---

Integrated in HBase-0.94-security #95 (See 
[https://builds.apache.org/job/HBase-0.94-security/95/])
HBASE-5416 Improve performance of scans with some kind of filters. (Sergey 
Shelukhin) (Revision 1433195)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/Filter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueExcludeFilter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/filter/TestSingleColumnValueExcludeFilter.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Improve performance of scans with some kind of filters.
 ---

 Key: HBASE-5416
 URL: https://issues.apache.org/jira/browse/HBASE-5416
 Project: HBase
  Issue Type: Improvement
  Components: Filters, Performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Sergey Shelukhin
 Fix For: 0.96.0, 0.94.5

 Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 5416-0.94-v3.txt, 
 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
 Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
 Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
 HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
 HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
 HBASE-5416-v9.patch, 
 org.apache.hadoop.hbase.regionserver.TestHRegion-output.txt


 When a scan is performed, the whole row is loaded into the result list, and 
 only then is the filter (if any) applied to decide whether the row is needed.
 But when a scan covers several CFs and the filter checks data from only a 
 subset of them, data from the CFs not checked by the filter is not needed at 
 the filter stage; it is needed only once we decide to include the current row. 
 In that case we can significantly reduce the amount of IO performed by a scan 
 by loading only the values actually checked by the filter.
 For example, we have two CFs: flags and snap. Flags is quite small (a bunch of 
 megabytes) and is used to filter large entries from snap. Snap is very large 
 (10s of GB) and is quite costly to scan. If we need only rows with some flag 
 specified, we use SingleColumnValueFilter to limit the result to a small 
 subset of the region. But the current implementation loads both CFs to 
 perform the scan, when only a small subset is needed.
 The attached patch adds one routine to the Filter interface to allow a filter 
 to specify which CFs are needed for its operation. In HRegion, we separate all 
 scanners into two groups: those needed for the filter and the rest (joined). 
 When a new row is considered, only the needed data is loaded and the filter 
 is applied; only if the filter accepts the row is the rest of the data loaded. 
 On our data, this speeds up such scans 30-50 times. It also gives us a way to 
 better normalize the data into separate columns by optimizing the scans 
 performed.
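
The two-phase load described above can be sketched without any HBase dependency. The following is a toy model only, not the patch itself: `Region`, `RowFilter`, `accept`, `sampleRegion`, and `flagIsSet` are hypothetical names invented for illustration, and `isFamilyEssential` is used here simply as a label for "the routine that lets a filter say which CFs it needs". The read counter stands in for IO.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LazyCfScanSketch {
    /** Hypothetical filter contract: names the families it inspects, then judges a row. */
    interface RowFilter {
        boolean isFamilyEssential(String family);
        boolean accept(Map<String, String> essentialCells);
    }

    /** Toy region: family name -> (row key -> value). Counts reads per family. */
    static final class Region {
        final Map<String, Map<String, String>> families = new TreeMap<>();
        final Map<String, Integer> reads = new TreeMap<>();

        void put(String family, String row, String value) {
            families.computeIfAbsent(family, f -> new TreeMap<>()).put(row, value);
        }

        String read(String family, String row) {
            reads.merge(family, 1, Integer::sum); // stand-in for the IO cost of a read
            return families.get(family).get(row);
        }

        /** Two-phase scan: essential families first, the rest only for accepted rows. */
        List<Map<String, String>> scan(List<String> rows, RowFilter filter) {
            List<Map<String, String>> results = new ArrayList<>();
            for (String row : rows) {
                Map<String, String> cells = new LinkedHashMap<>();
                for (String family : families.keySet()) {
                    if (filter.isFamilyEssential(family)) {
                        cells.put(family, read(family, row));
                    }
                }
                if (!filter.accept(cells)) {
                    continue; // rejected: the joined families are never read
                }
                for (String family : families.keySet()) {
                    if (!filter.isFamilyEssential(family)) {
                        cells.put(family, read(family, row));
                    }
                }
                results.add(cells);
            }
            return results;
        }
    }

    /** Builds a region with a small "flags" family and a large "snap" family. */
    static Region sampleRegion() {
        Region region = new Region();
        String[] flags = {"0", "1", "0", "0"};
        for (int i = 0; i < flags.length; i++) {
            region.put("flags", "row" + i, flags[i]);
            region.put("snap", "row" + i, "big-payload-" + i);
        }
        return region;
    }

    /** Filter that only looks at "flags" and keeps rows whose flag equals "1". */
    static RowFilter flagIsSet() {
        return new RowFilter() {
            public boolean isFamilyEssential(String family) {
                return family.equals("flags");
            }
            public boolean accept(Map<String, String> cells) {
                return "1".equals(cells.get("flags"));
            }
        };
    }

    public static void main(String[] args) {
        Region region = sampleRegion();
        List<Map<String, String>> rows =
            region.scan(Arrays.asList("row0", "row1", "row2", "row3"), flagIsSet());
        System.out.println(rows);         // the single accepted row, with both families
        System.out.println(region.reads); // "flags" read for every row, "snap" only once
    }
}
```

In this toy run, "flags" is read for all four rows while "snap" is read only for the one row the filter accepts, which is the kind of saving the description attributes to the patch.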

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-14 Thread Sergey Shelukhin (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13553167#comment-13553167
]

Sergey Shelukhin commented on HBASE-5416:
-

Can this be counted as +1 to commit? :)

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-14 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13553221#comment-13553221
]

Lars Hofhansl commented on HBASE-5416:
--

Yes. I will commit this today.

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13553329#comment-13553329
 ] 

Hudson commented on HBASE-5416:
---

Integrated in HBase-0.94 #732 (See 
[https://builds.apache.org/job/HBase-0.94/732/])
HBASE-5416 Improve performance of scans with some kind of filters. (Sergey 
Shelukhin) (Revision 1433195)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/Filter.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueExcludeFilter.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/filter/TestSingleColumnValueExcludeFilter.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13553406#comment-13553406
 ] 

Hudson commented on HBASE-5416:
---

Integrated in HBase-TRUNK #3745 (See 
[https://builds.apache.org/job/HBase-TRUNK/3745/])
HBASE-7383 create integration test for HBASE-5416 (improving scan 
performance for certain filters) (Sergey) (Revision 1433224)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/test/LoadTestDataGenerator.java
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/test/LoadTestKVGenerator.java
* 
/hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestLoadTestKVGenerator.java
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestLazyCfLoading.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedAction.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-11 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13551652#comment-13551652
]

Lars Hofhansl commented on HBASE-5416:
--

I think I convinced myself that this is good to go for 0.94.

Going forward this could be useful for all kinds of filters. I can see many
scenarios where we want filters to be evaluated on selected CFs only and
include the other CFs when the row is not filtered based on the former.

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Sergey Shelukhin (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549955#comment-13549955
]

Sergey Shelukhin commented on HBASE-5416:
-

Is this JIRA unresolved pending 0.94 commit? Just checking as it shows up in my
filter :)

Improve performance of scans with some kind of filters.
---

Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt,
5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch,
5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch,
Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch,
Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch,
HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch,
HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch,
HBASE-5416-v9.patch

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550431#comment-13550431
 ] 

Ted Yu commented on HBASE-5416:
---

For 0.94 patch, I saw the following on my Mac:
{code}
testScanner_JoinedScannersWithLimits(org.apache.hadoop.hbase.regionserver.TestHRegion)  Time elapsed: 0.001 sec  <<< FAILURE!
junit.framework.AssertionFailedError: expected:<3> but was:<1>
  at junit.framework.Assert.fail(Assert.java:50)
  at junit.framework.Assert.failNotEquals(Assert.java:287)
  at junit.framework.Assert.assertEquals(Assert.java:67)
  at junit.framework.Assert.assertEquals(Assert.java:199)
  at junit.framework.Assert.assertEquals(Assert.java:205)
  at org.apache.hadoop.hbase.regionserver.TestHRegion.testScanner_JoinedScannersWithLimits(TestHRegion.java:2976)
{code}

 Improve performance of scans with some kind of filters.
 ---

 Key: HBASE-5416
 URL: https://issues.apache.org/jira/browse/HBASE-5416
 Project: HBase
  Issue Type: Improvement
  Components: Filters, Performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Sergey Shelukhin
 Fix For: 0.96.0

 Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
 Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
 Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
 HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
 HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
 HBASE-5416-v9.patch



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550680#comment-13550680
]

Lars Hofhansl commented on HBASE-5416:
--

I ran all 0.94 tests. They all pass on my machines.

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550731#comment-13550731
]

Lars Hofhansl commented on HBASE-5416:
--

Does this test fail consistently for you?

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550737#comment-13550737
]

Lars Hofhansl commented on HBASE-5416:
--

On my machine this test fails too in 0.94.

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550772#comment-13550772
]

Lars Hofhansl commented on HBASE-5416:
--

Found the problem. For the part of the patch that I had applied manually I
mistook {{kv != KV_LIMIT}} for {{kv == KV_LIMIT}}.
No idea how on earth the test on my server machine passed.
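
To illustrate why flipping an identity check against a sentinel matters, here is a minimal sketch. It is not the HRegion code, which this thread does not show: `populate`, `countBatches`, and the meaning given to the sentinel here ("the batch limit was hit, more cells remain") are assumptions made for the example, and only the name `KV_LIMIT` is borrowed from the comment above. The numbers are chosen for illustration and are not a reconstruction of the failing test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LimitSentinelSketch {
    /** Sentinel compared by identity (hypothetical stand-in for KV_LIMIT). */
    static final Object KV_LIMIT = new Object();

    /**
     * Copies up to `limit` cells into `batch`. Returns KV_LIMIT when the batch
     * filled up, or null when the row ran out of cells first.
     */
    static Object populate(Iterator<String> cells, List<String> batch, int limit) {
        while (cells.hasNext()) {
            batch.add(cells.next());
            if (batch.size() == limit) {
                return KV_LIMIT;
            }
        }
        return null;
    }

    /**
     * Counts the batches a caller fetches for one row. With `inverted` set,
     * the sentinel check is flipped, as in the mistake described above.
     */
    static int countBatches(List<String> row, int limit, boolean inverted) {
        Iterator<String> cells = row.iterator();
        int batches = 0;
        boolean more = true;
        while (more && cells.hasNext()) {
            List<String> batch = new ArrayList<>();
            Object marker = populate(cells, batch, limit);
            batches++;
            // Correct check: keep going only while the limit was hit.
            more = inverted ? marker != KV_LIMIT : marker == KV_LIMIT;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(countBatches(row, 2, false)); // correct check
        System.out.println(countBatches(row, 2, true));  // inverted check stops early
    }
}
```

With five cells and a limit of two, the correct check yields three batches, while the inverted check stops after the first one.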

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550794#comment-13550794
]

Ted Yu commented on HBASE-5416:
---

Thanks, Lars, for the finding. I am running the test suite based on patch v3.
Will report back if there is any abnormality.
I was looking for long lines.
{code}
+ public static final String LOAD_CFS_ON_DEMAND_CONFIG_KEY = "hbase.hregion.scan.loadColumnFamiliesOnDemand";
{code}
nit: wrap long line above.
{code}
+ * @param heap KeyValueHeap to fetch data from. It must be positioned on correct row before call.
{code}
Long line: it would be 100 characters wide if the trailing period is removed.
{code}
+ stopRow = nextKv == null || isStopRow(nextKv.getBuffer(), nextKv.getRowOffset(), nextKv.getRowLength());
{code}

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-10 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550820#comment-13550820
]

Ted Yu commented on HBASE-5416:
---

The following tests failed locally:
TestMultiSlaveReplication,TestMasterReplication,TestZKLeaderManager

The first two failed without the patch, too.

I think patch v3 should be good to go.

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-09 Thread Karthik Ranganathan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13548688#comment-13548688
]

Karthik Ranganathan commented on HBASE-5416:

I think the specific description (of making filters apply to only some CFs) is
a good idea. But if we continue down this path of generalizing filters, it could
lead to an explosion of ad-hoc filters. In that case, it might be better to
expose more co-processor hooks. Overall, +1 (only skimmed the changes though).

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-09 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13548713#comment-13548713
]

Ted Yu commented on HBASE-5416:
---

Thanks for the review, Karthik.
I will think about how co-processor hooks can be used to reduce changes in
filters.

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-09 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549006#comment-13549006
]

Lars Hofhansl commented on HBASE-5416:
--

I don't necessarily think that ad hoc filters are bad. They are nice in that
they are per store, can do skip scans, etc. They fill a different use case
compared to coprocs.
If anything, this might be an impetus to support filters better (load them
dynamically like coprocs, maybe even invent a general filter description,
etc.).

Since nobody has better API ideas I'm +1 on committing (both 0.94 and 0.96).

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-09 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549012#comment-13549012
]

Ted Yu commented on HBASE-5416:
---

We have 4 +1's for this JIRA.
It is time to integrate.
I plan to do that in trunk by this evening.

@Lars:
I haven't run the test suite for the 0.94 patch. Do you want to integrate it into
the 0.94 branch?

Thanks

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-09 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549040#comment-13549040
]

Ted Yu commented on HBASE-5416:
---

Integrated to trunk.

Thanks for the patch, Max and Sergey.

Thanks for the review, Stack, Lars, Ram and Karthik.

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549091#comment-13549091
 ] 

Hudson commented on HBASE-5416:
---

Integrated in HBase-TRUNK #3716 (See 
[https://builds.apache.org/job/HBase-TRUNK/3716/])
HBASE-5416 Improve performance of scans with some kind of filters (Max 
Lapan and Sergey) (Revision 1431103)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* 
/hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* /hbase/trunk/hbase-protocol/src/main/protobuf/Client.proto
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/Filter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/FilterWrapper.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueExcludeFilter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestSingleColumnValueExcludeFilter.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestJoinedScanners.java



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-09 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549208#comment-13549208
]

Lars Hofhansl commented on HBASE-5416:
--

Looks like the trunk patch was not changed since I made the 0.94 patch (except
for wrapping the long line, etc).
I'll commit in the next day or so (unless somebody objects)

Improve performance of scans with some kind of filters.
---

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549238#comment-13549238
 ] 

Hudson commented on HBASE-5416:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #338 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/338/])
HBASE-5416 Improve performance of scans with some kind of filters (Max 
Lapan and Sergey) (Revision 1431103)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* 
/hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* /hbase/trunk/hbase-protocol/src/main/protobuf/Client.proto
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/Filter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/FilterWrapper.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueExcludeFilter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestSingleColumnValueExcludeFilter.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestJoinedScanners.java


 Improve performance of scans with some kind of filters.
 ---

 Key: HBASE-5416
 URL: https://issues.apache.org/jira/browse/HBASE-5416
 Project: HBase
  Issue Type: Improvement
  Components: Filters, Performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Sergey Shelukhin
 Fix For: 0.96.0

 Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
 Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
 Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
 HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
 HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
 HBASE-5416-v9.patch


 When a scan is performed, the whole row is loaded into the result list, and
 only then is the filter (if any) applied to decide whether the row is needed.
 But when a scan covers several CFs and the filter checks data from only a
 subset of them, data from the CFs not checked by the filter is not needed at
 the filter stage. It is needed only once we have decided to include the
 current row. In that case we can significantly reduce the amount of IO
 performed by a scan by loading only the values actually checked by the filter.
 For example, we have two CFs: flags and snap. Flags is quite small (a bunch of
 megabytes) and is used to filter large entries from snap. Snap is very large
 (tens of GB) and is quite costly to scan. If we need only rows with some flag
 set, we use SingleColumnValueFilter to limit the result to a small subset of
 the region. But the current implementation loads both CFs to perform the scan,
 when only a small subset is needed.
 The attached patch adds one routine to the Filter interface that lets a filter
 specify which CFs are needed for its operation. In HRegion, we separate all
 scanners into two groups: those needed by the filter and the rest (joined).
 When a new row is considered, only the needed data is loaded and the filter is
 applied; only if the filter accepts the row is the rest of the data loaded. On
 our data, this speeds up such scans 30-50 times. It also gives us a way to
 better normalize the data into separate columns by optimizing the scans
 performed.
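
The two-phase load described above can be illustrated with a small standalone sketch. This is not the patch and does not use any HBase classes: JoinedScanSketch, Store and scan are hypothetical names made up for illustration, each column family is reduced to a map from row key to a single value, and a read counter stands in for IO cost.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in HBase.
final class JoinedScanSketch {

    /** Stand-in for one column family: row key -> value, with a read counter as a proxy for IO. */
    static final class Store {
        final Map<String, String> rows = new LinkedHashMap<>();
        int reads = 0;

        String read(String rowKey) {
            reads++;
            return rows.get(rowKey);
        }
    }

    /**
     * Reads the family the filter needs for every row, and touches the
     * "joined" family only for rows the filter accepts.
     */
    static List<String> scan(Store essential, Store joined, Predicate<String> filter) {
        List<String> results = new ArrayList<>();
        for (String rowKey : essential.rows.keySet()) {
            String checked = essential.read(rowKey);   // phase 1: load only what the filter checks
            if (!filter.test(checked)) {
                continue;                              // rejected: the joined family is never read
            }
            results.add(rowKey + "=" + joined.read(rowKey)); // phase 2: load the rest of the row
        }
        return results;
    }

    public static void main(String[] args) {
        Store flags = new Store();  // small family checked by the filter
        Store snap = new Store();   // large family, loaded on demand
        for (int i = 0; i < 5; i++) {
            flags.rows.put("row" + i, i % 2 == 0 ? "keep" : "skip");
            snap.rows.put("row" + i, "blob" + i);
        }
        List<String> out = scan(flags, snap, "keep"::equals);
        System.out.println(out + " flagsReads=" + flags.reads + " snapReads=" + snap.reads);
    }
}
```

In this toy model the number of reads against the large family equals the number of accepted rows rather than the number of rows scanned, which is where the saving in the description comes from.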

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-08 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547076#comment-13547076
]

Ted Yu commented on HBASE-5416:
---

[~mikhail], [~karthik.ranga], [~kannanm]:
Your opinion would be helpful.

Thanks

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-07 Thread Sergey Shelukhin (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546430#comment-13546430
]

Sergey Shelukhin commented on HBASE-5416:
-

Btw, the test appears to pass.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-05 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544708#comment-13544708
]

Ted Yu commented on HBASE-5416:
---

Thanks for the review, Ram.

+1 from me too.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-05 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544746#comment-13544746
]

Hadoop QA commented on HBASE-5416:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12563428/5416-v16.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 9 new
or modified tests.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:red}-1 lineLengths{color}. The patch introduces lines longer than 100

{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.regionserver.TestSplitTransaction

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/3876//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3876//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/3876//console

This message is automatically generated.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-05 Thread ramkrishna.s.vasudevan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544755#comment-13544755
]

ramkrishna.s.vasudevan commented on HBASE-5416:
---

+1 from me.

[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-05 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544773#comment-13544773
]

Hadoop QA commented on HBASE-5416:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12563434/5416-v16.patch
against trunk revision .