[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query (using a KeyRange)

2011-07-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13064118#comment-13064118
 ] 

Hudson commented on CASSANDRA-1125:
---

Integrated in Cassandra-0.8 #214 (See 
[https://builds.apache.org/job/Cassandra-0.8/214/])
add KeyRangeoption to Hadoop inputformat
patch by Mck SembWever; reviewed by jbellis for CASSANDRA-1125

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1145731
Files : 
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/hadoop/ConfigHelper.java
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java


 Filter out ColumnFamily rows that aren't part of the query (using a KeyRange)
 -

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 0.8.2

 Attachments: 1125-formatted.txt, 1125-v3.txt, CASSANDRA-1125.patch, 
 CASSANDRA-1125.patch


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-07-10 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13062812#comment-13062812
 ] 

Mck SembWever commented on CASSANDRA-1125:
--

+1 (tested) on 1125-v3.txt

 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0

 Attachments: 1125-formatted.txt, 1125-v3.txt, CASSANDRA-1125.patch, 
 CASSANDRA-1125.patch


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query (using a KeyRange)

2011-07-10 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13062814#comment-13062814
 ] 

Mck SembWever commented on CASSANDRA-1125:
--

Created CASSANDRA-2878 for the better solution using a IndexClause

 Filter out ColumnFamily rows that aren't part of the query (using a KeyRange)
 -

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0

 Attachments: 1125-formatted.txt, 1125-v3.txt, CASSANDRA-1125.patch, 
 CASSANDRA-1125.patch


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-07-04 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059401#comment-13059401
 ] 

Mck SembWever commented on CASSANDRA-1125:
--

bq. using KeyRange but with tokens (which Thrift also uses for start-exclusive)
this is my preference. i'll make a patch for it.

 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0

 Attachments: 1125-formatted.txt, CASSANDRA-1125.patch


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-07-03 Thread Christian Hubmann (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059200#comment-13059200
 ] 

Christian Hubmann commented on CASSANDRA-1125:
--

Haha, yeah I follow this issue too... i just think that its going to be a
pain to have to rewrite the pig scripts for the additional transient
timebucket;-)




 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0

 Attachments: 1125-formatted.txt, CASSANDRA-1125.patch


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-07-02 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059098#comment-13059098
 ] 

Jeremy Hanna commented on CASSANDRA-1125:
-

So does this only include key ranges - that's what it sounds like.  And indexes 
are out for now too, it sounds like - e.g. where timebucket = 12345.

 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0

 Attachments: 1125-formatted.txt, CASSANDRA-1125.patch


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-07-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059114#comment-13059114
 ] 

Jonathan Ellis commented on CASSANDRA-1125:
---

Like Mck said, that will have to be split into another ticket, since it 
continues to depend on CASSANDRA-1600.

 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0

 Attachments: 1125-formatted.txt, CASSANDRA-1125.patch


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-07-01 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13058859#comment-13058859
 ] 

Jonathan Ellis commented on CASSANDRA-1125:
---

(And I'd be fine with putting this in 0.8.x.)

 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0

 Attachments: 1125-formatted.txt, CASSANDRA-1125.patch


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever commented on CASSANDRA-1125:
--

I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
limits in CFIF.

And i can use a {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}

But both approaches can't be combined. So i guess ConfigHelper could have 
methods setInputKeyRange(..) and setInputIndexClause(..) which are mutually 
exclusive to call.



 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 1.0


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-01-03 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976876#action_12976876
 ] 

Mck SembWever commented on CASSANDRA-1125:
--

Jonathan: do you mean the IndexExpression and IndexClause and 
Table.open(Keyspace1).getColumnFamilyStore(Indexed1).scan(clause, filter);
 inside of ColumnFamilyRecordReader.maybeInit() ??


 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 0.7.1


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-01-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976920#action_12976920
 ] 

Jonathan Ellis commented on CASSANDRA-1125:
---

I think everything I wanted to do here is covered by CASSANDRA-1600, but we 
weren't able to reach consensus on that for 0.7 so I tabled it.

 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 0.7.1


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2010-11-24 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935260#action_12935260
 ] 

Mck SembWever commented on CASSANDRA-1125:
--

this would be a very nice feature.
it has been brought up 
http://thread.gmane.org/gmane.comp.db.cassandra.user/4965 and 
http://thread.gmane.org/gmane.comp.db.cassandra.user/6135


 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Matthew F. Dennis
Priority: Minor
 Fix For: 0.7.1


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2010-05-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12872354#action_12872354
 ] 

Jonathan Ellis commented on CASSANDRA-1125:
---

another option would be to use the RowPredicate with CASSANDRA-749

 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Jeremy Hanna
Priority: Minor
 Fix For: 0.7


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.