[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query (using a KeyRange)

2011-08-30 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094036#comment-13094036
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 8/30/11 8:02 PM:
---

Something broke here in production once we went out with 0.8.2. It may have 
been some poor testing; I'm not entirely sure, and a little surprised.

CFIF:135 breaks because inside {{dhtRange.intersects(jobRange)}} there's a call 
to {{new Range(token, token)}} which calls {{StorageService.getPartitioner()}} 
and StorageService is null as we're not inside the server. 

A quick fix is to change Range:148 from {{new Range(token, token)}} to {{new 
Range(token, token, partitioner)}} making the presumption that the partitioner 
for the new Range will be the same as this Range.


  was (Author: michaelsembwever):
Something broke here in production once we went out with 0.8.2. It may have 
been some poor testing; I'm not entirely sure, and a little surprised.

CFIF:135 breaks because inside {{dhtRange.intersects(jobRange)}} there's a call 
to {{new Range(token, token)}} which calls {{StorageService.getPartitioner()}} 
and StorageService is null as we're not inside the server. 

A quick fix (tested) is to change Range:148 from {{new Range(token, token)}} to 
{{new Range(token, token, partitioner)}} making the presumption that the 
partitioner for the new Range will be the same as this Range.

  
 Filter out ColumnFamily rows that aren't part of the query (using a KeyRange)
 -

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 0.8.2

 Attachments: 1125-formatted.txt, 1125-v3.txt, CASSANDRA-1125.patch, 
 CASSANDRA-1125.patch


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query (using a KeyRange)

2011-08-30 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094036#comment-13094036
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 8/30/11 8:55 PM:
---

Something broke here in production once we went out with 0.8.2. It may have 
been some poor testing; I'm not entirely sure, and a little surprised.

CFIF:135 breaks because inside {{dhtRange.intersects(jobRange)}} there's a call 
to {{new Range(token, token)}} which calls {{StorageService.getPartitioner()}} 
and StorageService is null as we're not inside the server. 

A quick fix is to change Range:148 from {{new Range(token, token)}} to {{new 
Range(token, token, partitioner)}} making the presumption that the partitioner 
for the new Range will be the same as this Range. This won't work if the Range 
wraps in any way (which could be just a limitation of the current KeyRange 
filtering), but otherwise tests ok.
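
A minimal sketch of the failure and the quick fix described above. The classes here ({{TokenRange}}, {{ServerContext}}, {{Partitioner}}) are hypothetical stand-ins, not Cassandra's real {{Range}} and {{StorageService}}, and the exception type is an assumption made for illustration. It only shows why a constructor that looks up a server-side global cannot be used from client code such as a Hadoop job, and how passing this range's own partitioner avoids the lookup.

```java
// Hypothetical stand-ins for illustration only; not Cassandra's real classes.
interface Partitioner {
    String name();
}

final class ServerContext {
    // Assumed to be set only inside the server; stays null in a client process.
    static Partitioner partitioner = null;

    static Partitioner getPartitioner() {
        if (partitioner == null) {
            throw new IllegalStateException("not running inside the server");
        }
        return partitioner;
    }
}

final class TokenRange {
    final long left;
    final long right;
    final Partitioner partitioner;

    // Convenience constructor: depends on the server-side global lookup.
    TokenRange(long left, long right) {
        this(left, right, ServerContext.getPartitioner());
    }

    // Explicit constructor: no global state needed.
    TokenRange(long left, long right, Partitioner partitioner) {
        this.left = left;
        this.right = right;
        this.partitioner = partitioner;
    }

    // Before the fix: the helper range is built through the global lookup.
    TokenRange pointRangeViaGlobal(long token) {
        return new TokenRange(token, token);
    }

    // After the fix: the helper range reuses this range's partitioner.
    TokenRange pointRange(long token) {
        return new TokenRange(token, token, partitioner);
    }
}
```

In this sketch, calling {{pointRangeViaGlobal(..)}} outside the server throws, while {{pointRange(..)}} works as long as the original range was built with an explicit partitioner. It does not address wrapping ranges.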


  was (Author: michaelsembwever):
Something broke here in production once we went out with 0.8.2. It may have 
been some poor testing; I'm not entirely sure, and a little surprised.

CFIF:135 breaks because inside {{dhtRange.intersects(jobRange)}} there's a call 
to {{new Range(token, token)}} which calls {{StorageService.getPartitioner()}} 
and StorageService is null as we're not inside the server. 

A quick fix is to change Range:148 from {{new Range(token, token)}} to {{new 
Range(token, token, partitioner)}} making the presumption that the partitioner 
for the new Range will be the same as this Range.

  
 Filter out ColumnFamily rows that aren't part of the query (using a KeyRange)
 -

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 0.8.2

 Attachments: 1125-formatted.txt, 1125-v3.txt, CASSANDRA-1125.patch, 
 CASSANDRA-1125.patch







[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-27 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055503#comment-13055503
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 6/27/11 9:31 PM:
---

can this go into 0.8.1 ?
( and can we split this issue into two: 1) for KeyRange and 2) for IndexClause )


  was (Author: michaelsembwever):
can this go into 0.8.1 ?
  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0

 Attachments: CASSANDRA-1125.patch







[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-23 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 6/23/11 8:01 AM:
---

For now (without CASSANDRA-1600) I can use a {{KeyRange}} and 
{{Range.intersectionWith(..)}} for start/end rowKey limits in CFIF.

Upgrading from KeyRange to IndexClause (once it contains an optional KeyRange 
field) can be done easily enough later by replacing 
ConfigHelper.setInputKeyRange(..) with ConfigHelper.setInputIndexClause(..) and 
rewriting the two lines of code in CFRR's RowIterator.maybeInit(..).
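
A rough sketch of the idea of limiting a split to the job's start/end limits by intersecting two ranges. {{SimpleRange}} is a hypothetical class written for this example, not Cassandra's {{Range}}; it assumes tokens are plain longs, that ranges are of the form (left, right], and that neither range wraps around the ring.

```java
// Hypothetical non-wrapping token range (left, right]; not Cassandra's Range class.
final class SimpleRange {
    final long left;   // exclusive lower bound
    final long right;  // inclusive upper bound

    SimpleRange(long left, long right) {
        if (left >= right) {
            throw new IllegalArgumentException(
                "wrapping or empty ranges are out of scope for this sketch");
        }
        this.left = left;
        this.right = right;
    }

    // Returns the overlap of this range with other, or empty when they do not overlap.
    java.util.Optional<SimpleRange> intersectionWith(SimpleRange other) {
        long lo = Math.max(left, other.left);
        long hi = Math.min(right, other.right);
        return lo < hi
            ? java.util.Optional.of(new SimpleRange(lo, hi))
            : java.util.Optional.empty();
    }
}
```

With this sketch, a split covering (0, 10] and a job range of (5, 20] would be narrowed to (5, 10], and a split that does not overlap the job range would be dropped.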


  was (Author: michaelsembwever):
For now (without CASSANDRA-1600) I can use a {{KeyRange}} and 
{{Range.intersectionWith(..)}} for start/end rowKey limits in CFIF.

Upgrading from KeyRange to IndexClause (once it contains an optional KeyRange 
field) can be done easily enough later by replacing 
ConfigHelper.setInputKeyRange(..) with ConfigHelper.setInputIndexClause(..) and 
rewriting the code two lines of code in CFRR's RowIterator.maybeInit(..) 

  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0







[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-23 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 6/23/11 8:00 AM:
---

For now (without CASSANDRA-1600) I can use a {{KeyRange}} and 
{{Range.intersectionWith(..)}} for start/end rowKey limits in CFIF.

Upgrading from KeyRange to IndexClause (once it contains an optional KeyRange 
field) can be done easily enough later by replacing 
ConfigHelper.setInputKeyRange(..) with ConfigHelper.setInputIndexClause(..) and 
rewriting the code two lines of code in CFRR's RowIterator.maybeInit(..) 


  was (Author: michaelsembwever):
I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And I can use an {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}. But both 
approaches can't be combined. So I guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little early here about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible? (it 
needs to pass through the batch's range)


  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0







[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 6/22/11 9:08 PM:
---

I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And I can use an {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}. But both 
approaches can't be combined. So I guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little early here about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible?



  was (Author: michaelsembwever):
I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And I can use an {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}. But both 
approaches can't be combined. So I guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little earlier about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible?


  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 1.0







[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 6/22/11 9:08 PM:
---

I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And I can use an {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}. But both 
approaches can't be combined. So I guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little earlier about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible?



  was (Author: michaelsembwever):
I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
limits in CFIF.

And I can use an {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}.

But both approaches can't be combined. So I guess ConfigHelper could have 
methods setInputKeyRange(..) and setInputIndexClause(..) which are mutually 
exclusive to call.


  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 1.0







[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 6/22/11 9:10 PM:
---

I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And I can use an {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}. But both 
approaches can't be combined. So I guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little early here about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible? (it 
needs to pass through the batch's range)



  was (Author: michaelsembwever):
I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And I can use an {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}. But both 
approaches can't be combined. So I guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little early here about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible?


  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 1.0







[jira] Issue Comment Edited: (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-01-03 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976876#action_12976876
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 1/3/11 1:58 PM:
--

Jonathan: do you mean the IndexExpression and IndexClause and 
Table.open(Keyspace1).getColumnFamilyStore(Indexed1).scan(clause, filter);
 being used inside of ColumnFamilyRecordReader.maybeInit() ??


  was (Author: michaelsembwever):
Jonathan: do you mean the IndexExpression and IndexClause and 
Table.open(Keyspace1).getColumnFamilyStore(Indexed1).scan(clause, filter);
 inside of ColumnFamilyRecordReader.maybeInit() ??

  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 0.7.1



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-01-03 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976876#action_12976876
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 1/3/11 2:01 PM:
--

Jonathan: do you mean the IndexExpression and IndexClause and 
Table.open(Keyspace1).getColumnFamilyStore(Indexed1).scan(clause, filter);
 being used, instead of the KeyRange, inside of 
ColumnFamilyRecordReader.maybeInit() ??


  was (Author: michaelsembwever):
Jonathan: do you mean the IndexExpression and IndexClause and 
Table.open(Keyspace1).getColumnFamilyStore(Indexed1).scan(clause, filter);
 being used inside of ColumnFamilyRecordReader.maybeInit() ??

  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 0.7.1


