[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-06-03 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570443#comment-14570443
 ] 

Sam Tunnicliffe commented on CASSANDRA-8609:


I set it to ERROR because WARN includes a lot of noise from the 
registering/unregistering of Hadoop metrics mbeans. It was helpful for me to 
filter that out, but it's easy to do yourself when running the tests 
locally, so I'm fine with reverting the change to logback-test.xml if it helps 
with diagnosing failures on cassci. 

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
 Fix For: 2.2.0 rc1

 Attachments: 8609-2.2-2.txt, 8609-2.2.txt, 
 CASSANDRA-8609-3.0-branch.txt


 For some reason most of the Hadoop code (ColumnFamilyRecordReader, 
 CqlStorage, ...) uses the {{Cell}} and {{CellName}} classes. That dependency 
 is entirely artificial: all this code is really client code that communicates 
 with Cassandra over thrift/native protocol and there is thus no reason for it 
 to use internal classes. And in fact, those classes are used in a very crude 
 way, as a {{Pair<ByteBuffer, ByteBuffer>}} really.
 But this dependency is really painful when we make changes to the internals. 
 Further, every time we do so, I believe we break some of those APIs due 
 to the change. This has been painful for CASSANDRA-5417 and this is now 
 painful for CASSANDRA-8099. But while I somewhat hacked over it in 
 CASSANDRA-5417, this was a mistake and we should have removed the dependency 
 back then. So let's do that now.
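
 As a rough illustration of how little the client code needs, a column can be carried as a plain pair of byte buffers. This is only a sketch: {{RawColumnExample}} and its helpers are hypothetical names, not classes from the Cassandra tree, and the JDK's {{Map.Entry}} stands in for a pair type.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class RawColumnExample {
    // A column as client code needs it: raw name bytes paired with raw value bytes.
    static Map.Entry<ByteBuffer, ByteBuffer> column(String name, String value) {
        return new SimpleImmutableEntry<>(encode(name), encode(value));
    }

    static ByteBuffer encode(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    static String decode(ByteBuffer b) {
        // duplicate() keeps the caller's buffer position untouched
        return StandardCharsets.UTF_8.decode(b.duplicate()).toString();
    }

    public static void main(String[] args) {
        Map.Entry<ByteBuffer, ByteBuffer> col = column("city", "Paris");
        System.out.println(decode(col.getKey()) + " = " + decode(col.getValue()));
    }
}
```

 Nothing here touches an internal class, which is the property the ticket asks for.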



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-06-03 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570856#comment-14570856
 ] 

Philip Thompson commented on CASSANDRA-8609:


In that case, +1 to the patch, with logback set to WARN.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
 Fix For: 2.2.0 rc1

 Attachments: 8609-2.2-2.txt, 8609-2.2.txt, 
 CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-06-02 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569536#comment-14569536
 ] 

Sam Tunnicliffe commented on CASSANDRA-8609:


Not just yet, I'm afraid. There are still some dependencies on internals like 
{{IPartitioner}} and {{AbstractType}} in determining splits. 

Ideally, the Hadoop integration code shouldn't have any dependencies beyond the 
cql driver and thrift (and client-utils). For that to happen though, we'll need 
to duplicate a few internal things so that split sizing can be done entirely 
client side (along the lines of 
[SPARKC-94|https://datastax-oss.atlassian.net/browse/SPARKC-94]). Maybe that 
should be done as part of CASSANDRA-9353.
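
For a sense of what client-side split sizing could involve without {{IPartitioner}}, here is a minimal sketch. It is not the SPARKC-94 approach or any existing Cassandra code: it assumes tokens are plain signed longs (as with a Murmur3-style partitioner), that the range does not wrap, and that an estimated partition count is already available from somewhere. {{SplitSizingSketch}}, {{TokenRange}} and {{split}} are hypothetical names.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; assumes long tokens and a non-wrapping range.
final class SplitSizingSketch {
    /** Token range (start, end], with start < end. */
    static final class TokenRange {
        final long start;
        final long end;
        TokenRange(long start, long end) { this.start = start; this.end = end; }
    }

    /** Cuts a range into roughly equal sub-ranges of about targetSplitSize partitions each. */
    static List<TokenRange> split(TokenRange range, long estimatedPartitions, long targetSplitSize) {
        if (range.start >= range.end) throw new IllegalArgumentException("wrapping ranges are not handled");
        if (estimatedPartitions < 0 || targetSplitSize <= 0) throw new IllegalArgumentException("bad sizes");

        // ceil(estimatedPartitions / targetSplitSize), but always at least one piece
        long pieces = estimatedPartitions / targetSplitSize
                    + (estimatedPartitions % targetSplitSize == 0 ? 0 : 1);
        pieces = Math.max(1, pieces);

        // BigInteger avoids overflow: the width of a full long range does not fit in a long.
        BigInteger start = BigInteger.valueOf(range.start);
        BigInteger width = BigInteger.valueOf(range.end).subtract(start);
        // Never ask for more pieces than there are tokens, so no sub-range is empty.
        BigInteger n = BigInteger.valueOf(pieces).min(width);

        List<TokenRange> out = new ArrayList<>();
        long previous = range.start;
        for (long i = 1; i <= n.longValueExact(); i++) {
            // i-th boundary = start + width * i / n; the last one lands exactly on range.end
            long boundary = start.add(width.multiply(BigInteger.valueOf(i)).divide(n)).longValueExact();
            out.add(new TokenRange(previous, boundary));
            previous = boundary;
        }
        return out;
    }
}
```

Even division by token width is only a reasonable choice if partitions are spread evenly across the range; a real implementation would also need the partitioner-specific token type and ordering, which is the part that currently pulls in the internals.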

There are also some dependencies in {{CqlBulkRecordWriter}} ({{CFMetaData}}, 
{{Config}}, {{DatabaseDescriptor}}) which can't easily be removed yet.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
 Fix For: 2.2.0 rc1

 Attachments: 8609-2.2-2.txt, 8609-2.2.txt, 
 CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-06-02 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569469#comment-14569469
 ] 

Sam Tunnicliffe commented on CASSANDRA-8609:


Pushed a new branch with the changes mentioned above, plus some further cleanup 
to the bundled examples. 
I've removed {{AbstractCassandraStorage}} as it seems it should have actually 
gone in CASSANDRA-8358.
Diff 
[here|https://github.com/apache/cassandra/compare/cassandra-2.2...beobal:8609-2.2]
  I'll update with test results when cassci has run.


 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
 Fix For: 2.2.0 rc1

 Attachments: 8609-2.2-2.txt, 8609-2.2.txt, 
 CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-06-02 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14569473#comment-14569473
 ] 

Jeremiah Jordan commented on CASSANDRA-8609:


bq. I don't think we can completely remove the dependency on internal classes 
in this way as it would remove the ability to write M/R jobs which use 
timestamp and ttl.

[~beobal] can we at least make sure that they are not used anywhere in the 
non-thrift code (including common base classes)? If we rm all the thrift-using 
sub-classes, we should not need to use any internal stuff, as everything should 
be using classes out of the java driver at that point, no?

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
 Fix For: 2.2.0 rc1

 Attachments: 8609-2.2-2.txt, 8609-2.2.txt, 
 CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-06-01 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14567188#comment-14567188
 ] 

Sam Tunnicliffe commented on CASSANDRA-8609:


I don't think we can completely remove the dependency on internal classes in 
this way as it would remove the ability to write M/R jobs which use timestamp 
and ttl. While it doesn't break any of the bundled pig or hadoop examples, it's 
feasible for jobs out in the wild to be doing this. 

I think the right thing to do is to create a new simple class in the 
{{org.apache.cassandra.hadoop}} package to represent a column (much like the 
old {{org.apache.cassandra.db.Column}} from 2.0) and use that throughout the 
thrift side of the hadoop integration. The 
{{ColumnFamilyRecordReader#unthriftifyX}} methods should then be translating 
from the thrift classes into these new simple columns.
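
A minimal sketch of what such a class and translation might look like follows. All names are hypothetical: {{SimpleColumn}} is a placeholder for the proposed class, and {{WireColumn}} is a stand-in for the Thrift-generated column type so that the example compiles on its own. It assumes a ttl of 0 means "no TTL".

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the class that was eventually committed.
final class SimpleColumnSketch {
    /** Stand-in for a Thrift-generated column, so the example is self-contained. */
    static final class WireColumn {
        ByteBuffer name;
        ByteBuffer value;
        long timestamp;
        int ttl; // assumption: 0 means the column has no TTL
    }

    /** Immutable client-side column: just the fields an M/R job may need. */
    static final class SimpleColumn {
        final ByteBuffer name;
        final ByteBuffer value;
        final long timestamp;
        final int ttl;

        SimpleColumn(ByteBuffer name, ByteBuffer value, long timestamp, int ttl) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
            this.ttl = ttl;
        }
    }

    /** Translates the wire representation into the simple column, keeping timestamp and ttl. */
    static SimpleColumn unthriftify(WireColumn c) {
        // duplicate() gives independent positions over the same bytes
        return new SimpleColumn(c.name.duplicate(), c.value.duplicate(), c.timestamp, c.ttl);
    }

    public static void main(String[] args) {
        WireColumn wire = new WireColumn();
        wire.name = ByteBuffer.wrap("city".getBytes(StandardCharsets.UTF_8));
        wire.value = ByteBuffer.wrap("Paris".getBytes(StandardCharsets.UTF_8));
        wire.timestamp = 1433116800000L;
        wire.ttl = 3600;
        SimpleColumn col = unthriftify(wire);
        System.out.println(col.timestamp + " ttl=" + col.ttl);
    }
}
```

Because timestamp and ttl survive the translation, jobs that rely on them keep working while the dependency on {{Cell}} goes away.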

Also, the utility of {{AbstractCassandraStorage}} isn't clear to me. 
{{CassandraStorage}} doesn't extend it and I can't find any reference to it in 
the project at all (i.e. it isn't being tested/exercised by any of the demos as 
far as I can tell). Is there any reason why users writing their own 
{{LoadStoreFunc}} would choose to extend {{ACS}} rather than {{CS}}? At the 
very least, shouldn't it be marked deprecated like {{CS}}?

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Philip Thompson
 Fix For: 2.2.0 rc1

 Attachments: 8609-2.2-2.txt, 8609-2.2.txt, 
 CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-05-20 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552968#comment-14552968
 ] 

Philip Thompson commented on CASSANDRA-8609:


Here are the CI results including the fix from CASSANDRA-9442. Once that is 
committed, you can review.

http://cassci.datastax.com/view/Dev/view/ptnapoleon/job/ptnapoleon-cassandra-8609-testall/lastCompletedBuild/testReport/

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Philip Thompson
 Fix For: 2.2.0 rc1

 Attachments: 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-05-18 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548154#comment-14548154
 ] 

Aleksey Yeschenko commented on CASSANDRA-8609:
--

With Thrift-based Hadoop code going away in 3.0/CASSANDRA-9353 it's important 
that the 2.2 versions of it can work without any internal dependencies (so that 
you can use the 2.2 versions with 3.0, if you need to).

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Philip Thompson
 Fix For: 2.2 rc1

 Attachments: CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-05-18 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549139#comment-14549139
 ] 

Philip Thompson commented on CASSANDRA-8609:


Waiting on test results from cassci, but here is the patch:

Squashed:
https://github.com/ptnapoleon/cassandra/tree/8609-squashed-notest
Normal:
https://github.com/ptnapoleon/cassandra/tree/cassandra-8609

This is ready for review, but not for commit. I had to merge in CASSANDRA-9410 
due to build issues. Once/if that is committed, I will merge cassandra-2.2 back 
into my branch.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Philip Thompson
 Fix For: 2.2 rc1

 Attachments: 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-05-18 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549169#comment-14549169
 ] 

Philip Thompson commented on CASSANDRA-8609:


Oh, you will also notice a probably out-of-scope change in CqlRecordWriter. 
I've decided it's probably best to move that to another ticket, and block this.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Philip Thompson
 Fix For: 2.2 rc1

 Attachments: 8609-2.2.txt, CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-04-20 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502788#comment-14502788
 ] 

Philip Thompson commented on CASSANDRA-8609:


This is not a duplicate; we will need something additional on top of 
CASSANDRA-8358.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Philip Thompson
 Fix For: 3.0

 Attachments: CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-04-17 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14500199#comment-14500199
 ] 

Sylvain Lebresne commented on CASSANDRA-8609:
-

[~philipthompson] So is this just a duplicate of CASSANDRA-8358 or will we need 
something more for this?

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Philip Thompson
 Fix For: 3.0

 Attachments: CASSANDRA-8609-3.0-branch.txt




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-02-12 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317851#comment-14317851
 ] 

Sylvain Lebresne commented on CASSANDRA-8609:
-

[~alexliu68] Any chance you'll be able to work on this soonish? Otherwise we'll 
re-assign as we kind of want that for 3.0.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Alex Liu
 Fix For: 3.0




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-02-12 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319020#comment-14319020
 ] 

Alex Liu commented on CASSANDRA-8609:
-

I ran pig-test on trunk and found some failing test cases; I am fixing those 
in this ticket as well.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Alex Liu
 Fix For: 3.0




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-02-12 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319162#comment-14319162
 ] 

Philip Thompson commented on CASSANDRA-8609:


Alex, I believe most of this will be covered by CASSANDRA-8358. At the very 
least the pig-test failures are fixed there.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Alex Liu
 Fix For: 3.0




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-02-12 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319176#comment-14319176
 ] 

Alex Liu commented on CASSANDRA-8609:
-

All pig tests fail, and CASSANDRA-8358 is addressing the issue. I have attached my 
patch on cassandra-3.0 as a reference for [~philipthompson]. He will take over 
this ticket and address it on CASSANDRA-8358.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Alex Liu
 Fix For: 3.0




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-02-12 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14318501#comment-14318501
 ] 

Alex Liu commented on CASSANDRA-8609:
-

Sorry, I missed this ticket. I will work on it today or tomorrow to get it done. 

Does this ticket only remove Cell and CellName from the Hadoop-related classes?

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Alex Liu
 Fix For: 3.0




[jira] [Commented] (CASSANDRA-8609) Remove depency of hadoop to internals (Cell/CellName)

2015-02-12 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14318521#comment-14318521
 ] 

Sylvain Lebresne commented on CASSANDRA-8609:
-

bq. Does this ticket only remove Cell and CellName from the Hadoop-related 
classes?

Basically, yes.

bq. Sorry, I missed this ticket. I will work on it today or tomorrow to get it 
done.

No problem, thanks.

 Remove depency of hadoop to internals (Cell/CellName)
 -

 Key: CASSANDRA-8609
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8609
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Alex Liu
 Fix For: 3.0

