Re: possible cell buffer size issue

2022-01-28 Thread Neophytos Demetriou
Thank you Bowen, I ended up using the type of the cell to get the string
for now.
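
For readers following along, a JDK-only analogue of "decode just the value" might look like the sketch below. It is not the Cassandra type API mentioned above; it only decodes the bytes between position and limit. The class and method names are made up for illustration, and the 1 MiB backing array with the value at index 209 is an assumption chosen to resemble the numbers reported in this thread.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Cassandra.
final class Utf8Cells {
    /** Decodes the bytes between position and limit as UTF-8 without moving the caller's position. */
    static String decodeUtf8(ByteBuffer source) {
        // duplicate() shares the content but has its own position and limit,
        // so decoding does not consume the caller's buffer.
        return StandardCharsets.UTF_8.decode(source.duplicate()).toString();
    }

    public static void main(String[] args) {
        byte[] backing = new byte[1 << 20];                    // assumed large shared array
        byte[] value = "/b1".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(value, 0, backing, 209, value.length);
        ByteBuffer cellValue = ByteBuffer.wrap(backing, 209, value.length);

        System.out.println(decodeUtf8(cellValue));             // the 3-character value
        System.out.println(cellValue.remaining());             // still 3 after decoding
    }
}
```

Decoding through a duplicate keeps the original buffer reusable for a later read.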

On Fri, Jan 28, 2022 at 5:01 AM Bowen Song  wrote:

> Just FYI, you may want to put the return value of cell.buffer() in a
> variable instead of calling it twice, because there's no guarantee that you
> will get the same (cached) ByteBuffer object on the second call. Also, you
> may want to do a rewind() first, just in case...
> On 28/01/2022 09:22, Neophytos Demetriou wrote:
>
> I've solved the issue with the following for the time being:
>
> byte[] arr = new byte[cell.buffer().remaining()];
> cell.buffer().get(arr);
>
> I shouldn't have been calling array() in the first place it seems.
>
> - Neophytos
>
> On Fri, Jan 28, 2022 at 2:06 AM Neophytos Demetriou 
> wrote:
>
>> Hi, thanks for the prompt reply.
>>
>> I've tried this. Here's what I'm writing:
>> bytes: 3 capacity: 3 limit: 3 offset: 0
>>
>> Here's what I'm reading:
>> cell buffer size: 1048576 capacity: 1048576 limit: 212 arrayOffset: 0
>>
>> It still does not seem right. I would have expected Cassandra to allocate
>> a buffer the size of the text field. Unless I'm missing something,
>> org.apache.cassandra.db.marshal.AbstractType#read already does this. It
>> calls org.apache.cassandra.utils.ByteBufferUtil#read that allocates a byte
>> array the size of the given length. I'm still checking but it could be that
>> the readUnsignedVInt call in AbstractType#read reads the wrong thing under
>> the given circumstances (very likely an issue on my end). I would welcome
>> any ideas on how to debug this.
>>
>> - Neophytos
>>
>> On Thu, Jan 27, 2022 at 5:43 PM Bowen Song  wrote:
>>
>>> I'm not a Java developer, but to the best of my knowledge, the
>>> ByteBuffer.array() method returns the whole backing byte array, not just
>>> the part that's meaningful (i.e. has ever been written to). You may want
>>> to check the difference between bb.capacity() and bb.limit(), and also
>>> check bb.arrayOffset(), because the first element is not always at the
>>> beginning of the byte array.
>>> On 27/01/2022 22:11, Neophytos Demetriou wrote:
>>>
>>> Hi,
>>>
>>> I'm new to the list but not new to Cassandra. I'm writing an app on top
>>> of C* and I have come across an issue (huge cell buffer size after applying
>>> a mutation) that I haven't been able to resolve yet. I would appreciate any
>>> suggestions/help to resolve this. Here are the details:
>>>
>>> 1. I have a column family defined as follows:
>>>
>>> TableMetadata.Builder metadata =
>>>     TableMetadata
>>>         .builder(KEYSPACE1, CF_STANDARD1)
>>>         .addPartitionKeyColumn("key", Int32Type.instance)
>>>         .addRegularColumn(
>>>             "a",
>>>             MapType.getInstance(AsciiType.instance, SetType.getInstance(UTF8Type.instance, false), false))
>>>         .addRegularColumn("b", UTF8Type.instance);
>>>
>>> 2. And here's a test that I wrote and that works on the cassandra-4.0
>>> branch:
>>>
>>> Row.Builder builder = BTreeRow.unsortedBuilder();
>>> builder.newRow(Clustering.EMPTY);
>>> ColumnMetadata def = metadata.getColumn(new ColumnIdentifier("b", true));
>>> Cell cell = BufferCell.live(def, System.currentTimeMillis(), UTF8Type.instance.decompose("/b1"));
>>> builder.addCell(cell);
>>> PartitionUpdate update = PartitionUpdate.singleRowUpdate(metadata, dk, builder.build());
>>> new Mutation(update).apply();
>>> Row row = Util.getOnlyRow(Util.cmd(cfs, dk).withLimit(1).build());
>>> assertEquals(3, row.getCell(def).buffer().array().length);
>>>
>>> 3. However, in my app, when I call getOnlyRow after applying the
>>> mutation, the string value of b has length 3 but buffer().array().length
>>> is 1048576.
>>>
>>> 4. Restarting the app (which starts the Cassandra daemon) fixes the
>>> issue, i.e. getOnlyRow returns the correct buffer size.
>>>
>>> 5. I'm importing cassandra-all 4.0.1 and the app uses jdk-11.
>>>
>>> If you need further info, please do not hesitate to ask.
>>>
>>> - Neophytos
>>>
>>> PS. I'm experimenting with C* internals for the first time so it's very
>>> likely I'm doing something wrong.
>>>
>>>
>>>


Re: possible cell buffer size issue

2022-01-28 Thread Bowen Song
Just FYI, you may want to put the return value of cell.buffer() in a
variable instead of calling it twice, because there's no guarantee that
you will get the same (cached) ByteBuffer object on the second call.
Also, you may want to do a rewind() first, just in case...
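
To illustrate the point about buffer state: a relative get() advances the buffer's position, so a second read from the same object finds nothing left unless the position is reset. The sketch below uses only java.nio; the class and method names are illustrative and not from Cassandra. One caveat, stated as general ByteBuffer behaviour rather than anything verified against Cassandra: rewind() sets the position to zero, which is only correct when the value starts at index 0 of the buffer.

```java
import java.nio.ByteBuffer;

// Illustrative only; names are not from Cassandra.
final class RelativeGetDemo {
    /** Returns remaining() before and after a relative bulk get on the same buffer. */
    static int[] remainingBeforeAndAfterGet(ByteBuffer bb) {
        int before = bb.remaining();
        bb.get(new byte[before]);          // relative get: position moves to limit
        return new int[] { before, bb.remaining() };
    }

    public static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.wrap(new byte[] { '/', 'b', '1' });
        int[] r = remainingBeforeAndAfterGet(bb);
        System.out.println(r[0] + " then " + r[1]);   // 3 then 0

        bb.rewind();                                   // position back to 0, limit unchanged
        System.out.println(bb.remaining());            // 3 again
    }
}
```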


On 28/01/2022 09:22, Neophytos Demetriou wrote:

I've solved the issue with the following for the time being:
byte[] arr = new byte[cell.buffer().remaining()];
cell.buffer().get(arr);
I shouldn't have been calling array() in the first place it seems.

- Neophytos

On Fri, Jan 28, 2022 at 2:06 AM Neophytos Demetriou 
 wrote:


Hi, thanks for the prompt reply.

I've tried this. Here's what I'm writing:
bytes: 3 capacity: 3 limit: 3 offset: 0

Here's what I'm reading:
cell buffer size: 1048576 capacity: 1048576 limit: 212 arrayOffset: 0

It still does not seem right. I would have expected Cassandra to
allocate a buffer the size of the text field. Unless I'm missing
something, org.apache.cassandra.db.marshal.AbstractType#read
already does this. It calls
org.apache.cassandra.utils.ByteBufferUtil#read that allocates a
byte array the size of the given length. I'm still checking but it
could be that the readUnsignedVInt call in AbstractType#read reads
the wrong thing under the given circumstances (very likely an
issue on my end). I would welcome any ideas on how to debug this.

- Neophytos

On Thu, Jan 27, 2022 at 5:43 PM Bowen Song  wrote:

I'm not a Java developer, but to the best of my knowledge, the
ByteBuffer.array() method returns the whole backing byte array,
not just the part that's meaningful (i.e. has ever been written
to). You may want to check the difference between bb.capacity()
and bb.limit(), and also check bb.arrayOffset(), because the
first element is not always at the beginning of the byte array.

On 27/01/2022 22:11, Neophytos Demetriou wrote:

Hi,

I'm new to the list but not new to Cassandra. I'm writing an
app on top of C* and I have come across an issue (huge cell
buffer size after applying a mutation) that I haven't been
able to resolve yet. I would appreciate any suggestions/help
to resolve this. Here are the details:

1. I have a column family defined as follows:
TableMetadata.Builder metadata =
    TableMetadata
        .builder(KEYSPACE1, CF_STANDARD1)
        .addPartitionKeyColumn("key", Int32Type.instance)
        .addRegularColumn(
            "a",
            MapType.getInstance(AsciiType.instance, SetType.getInstance(UTF8Type.instance, false), false))
        .addRegularColumn("b", UTF8Type.instance);
2. And here's a test that I wrote and that works on the cassandra-4.0
branch:
Row.Builder builder = BTreeRow.unsortedBuilder();
builder.newRow(Clustering.EMPTY);
ColumnMetadata def = metadata.getColumn(new ColumnIdentifier("b", true));
Cell cell = BufferCell.live(def, System.currentTimeMillis(), UTF8Type.instance.decompose("/b1"));
builder.addCell(cell);
PartitionUpdate update = PartitionUpdate.singleRowUpdate(metadata, dk, builder.build());
new Mutation(update).apply();
Row row = Util.getOnlyRow(Util.cmd(cfs, dk).withLimit(1).build());
assertEquals(3, row.getCell(def).buffer().array().length);
3. However, in my app, when I call getOnlyRow after applying
the mutation, the string value of b has length 3 but
buffer().array().length is 1048576.

4. Restarting the app (which starts the Cassandra daemon)
fixes the issue, i.e. getOnlyRow returns the correct buffer size.

5. I'm importing cassandra-all 4.0.1 and the app uses jdk-11.

If you need further info, please do not hesitate to ask.

- Neophytos

PS. I'm experimenting with C* internals for the first time so
it's very likely I'm doing something wrong.



Re: possible cell buffer size issue

2022-01-28 Thread Neophytos Demetriou
I've solved the issue with the following for the time being:

byte[] arr = new byte[cell.buffer().remaining()];
cell.buffer().get(arr);

I shouldn't have been calling array() in the first place it seems.
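
A slightly more defensive variant of the snippet above, as a sketch under stated assumptions: take the buffer once, duplicate it, and copy only the remaining bytes, so the caller's position is left alone and only one ByteBuffer object is involved. The class and method names are hypothetical. Cassandra's ByteBufferUtil class may already provide an equivalent helper; that is not checked here.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Cassandra.
final class BufferCopy {
    /** Copies the bytes between position and limit into a right-sized array. */
    static byte[] toByteArray(ByteBuffer source) {
        ByteBuffer view = source.duplicate();   // own position/limit, shared content
        byte[] out = new byte[view.remaining()];
        view.get(out);                          // advances only the duplicate
        return out;
    }

    public static void main(String[] args) {
        byte[] backing = new byte[16];
        backing[5] = '/';
        backing[6] = 'b';
        backing[7] = '1';
        ByteBuffer value = ByteBuffer.wrap(backing, 5, 3);

        byte[] copy = toByteArray(value);
        System.out.println(copy.length);         // 3, not the backing array's 16
        System.out.println(value.remaining());   // 3, the source was not consumed
    }
}
```

With a cell, the call would be along the lines of toByteArray(cell.buffer()), which calls buffer() exactly once.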

- Neophytos

On Fri, Jan 28, 2022 at 2:06 AM Neophytos Demetriou 
wrote:

> Hi, thanks for the prompt reply.
>
> I've tried this. Here's what I'm writing:
> bytes: 3 capacity: 3 limit: 3 offset: 0
>
> Here's what I'm reading:
> cell buffer size: 1048576 capacity: 1048576 limit: 212 arrayOffset: 0
>
> It still does not seem right. I would have expected Cassandra to allocate
> a buffer the size of the text field. Unless I'm missing something,
> org.apache.cassandra.db.marshal.AbstractType#read already does this. It
> calls org.apache.cassandra.utils.ByteBufferUtil#read that allocates a byte
> array the size of the given length. I'm still checking but it could be that
> the readUnsignedVInt call in AbstractType#read reads the wrong thing under
> the given circumstances (very likely an issue on my end). I would welcome
> any ideas on how to debug this.
>
> - Neophytos
>
> On Thu, Jan 27, 2022 at 5:43 PM Bowen Song  wrote:
>
>> I'm not a Java developer, but to the best of my knowledge, the
>> ByteBuffer.array() method returns the whole backing byte array, not just
>> the part that's meaningful (i.e. has ever been written to). You may want
>> to check the difference between bb.capacity() and bb.limit(), and also
>> check bb.arrayOffset(), because the first element is not always at the
>> beginning of the byte array.
>> On 27/01/2022 22:11, Neophytos Demetriou wrote:
>>
>> Hi,
>>
>> I'm new to the list but not new to Cassandra. I'm writing an app on top
>> of C* and I have come across an issue (huge cell buffer size after applying
>> a mutation) that I haven't been able to resolve yet. I would appreciate any
>> suggestions/help to resolve this. Here are the details:
>>
>> 1. I have a column family defined as follows:
>>
>> TableMetadata.Builder metadata =
>>     TableMetadata
>>         .builder(KEYSPACE1, CF_STANDARD1)
>>         .addPartitionKeyColumn("key", Int32Type.instance)
>>         .addRegularColumn(
>>             "a",
>>             MapType.getInstance(AsciiType.instance, SetType.getInstance(UTF8Type.instance, false), false))
>>         .addRegularColumn("b", UTF8Type.instance);
>>
>> 2. And here's a test that I wrote and that works on the cassandra-4.0
>> branch:
>>
>> Row.Builder builder = BTreeRow.unsortedBuilder();
>> builder.newRow(Clustering.EMPTY);
>> ColumnMetadata def = metadata.getColumn(new ColumnIdentifier("b", true));
>> Cell cell = BufferCell.live(def, System.currentTimeMillis(), UTF8Type.instance.decompose("/b1"));
>> builder.addCell(cell);
>> PartitionUpdate update = PartitionUpdate.singleRowUpdate(metadata, dk, builder.build());
>> new Mutation(update).apply();
>> Row row = Util.getOnlyRow(Util.cmd(cfs, dk).withLimit(1).build());
>> assertEquals(3, row.getCell(def).buffer().array().length);
>>
>> 3. However, in my app, when I call getOnlyRow after applying the
>> mutation, the string value of b has length 3 but buffer().array().length
>> is 1048576.
>>
>> 4. Restarting the app (which starts the Cassandra daemon) fixes the
>> issue, i.e. getOnlyRow returns the correct buffer size.
>>
>> 5. I'm importing cassandra-all 4.0.1 and the app uses jdk-11.
>>
>> If you need further info, please do not hesitate to ask.
>>
>> - Neophytos
>>
>> PS. I'm experimenting with C* internals for the first time so it's very
>> likely I'm doing something wrong.
>>
>>
>>


Re: possible cell buffer size issue

2022-01-27 Thread Neophytos Demetriou
Hi, thanks for the prompt reply.

I've tried this. Here's what I'm writing:
bytes: 3 capacity: 3 limit: 3 offset: 0

Here's what I'm reading:
cell buffer size: 1048576 capacity: 1048576 limit: 212 arrayOffset: 0

It still does not seem right. I would have expected Cassandra to allocate a
buffer the size of the text field. Unless I'm missing something,
org.apache.cassandra.db.marshal.AbstractType#read already does this. It
calls org.apache.cassandra.utils.ByteBufferUtil#read that allocates a byte
array the size of the given length. I'm still checking but it could be that
the readUnsignedVInt call in AbstractType#read reads the wrong thing under
the given circumstances (very likely an issue on my end). I would welcome
any ideas on how to debug this.

- Neophytos

On Thu, Jan 27, 2022 at 5:43 PM Bowen Song  wrote:

> I'm not a Java developer, but to the best of my knowledge, the
> ByteBuffer.array() method returns the whole backing byte array, not just
> the part that's meaningful (i.e. has ever been written to). You may want
> to check the difference between bb.capacity() and bb.limit(), and also
> check bb.arrayOffset(), because the first element is not always at the
> beginning of the byte array.
> On 27/01/2022 22:11, Neophytos Demetriou wrote:
>
> Hi,
>
> I'm new to the list but not new to Cassandra. I'm writing an app on top of
> C* and I have come across an issue (huge cell buffer size after applying a
> mutation) that I haven't been able to resolve yet. I would appreciate any
> suggestions/help to resolve this. Here are the details:
>
> 1. I have a column family defined as follows:
>
> TableMetadata.Builder metadata =
>     TableMetadata
>         .builder(KEYSPACE1, CF_STANDARD1)
>         .addPartitionKeyColumn("key", Int32Type.instance)
>         .addRegularColumn(
>             "a",
>             MapType.getInstance(AsciiType.instance, SetType.getInstance(UTF8Type.instance, false), false))
>         .addRegularColumn("b", UTF8Type.instance);
>
> 2. And here's a test that I wrote and that works on the cassandra-4.0
> branch:
>
> Row.Builder builder = BTreeRow.unsortedBuilder();
> builder.newRow(Clustering.EMPTY);
> ColumnMetadata def = metadata.getColumn(new ColumnIdentifier("b", true));
> Cell cell = BufferCell.live(def, System.currentTimeMillis(), UTF8Type.instance.decompose("/b1"));
> builder.addCell(cell);
> PartitionUpdate update = PartitionUpdate.singleRowUpdate(metadata, dk, builder.build());
> new Mutation(update).apply();
> Row row = Util.getOnlyRow(Util.cmd(cfs, dk).withLimit(1).build());
> assertEquals(3, row.getCell(def).buffer().array().length);
>
> 3. However, in my app, when I call getOnlyRow after applying the mutation,
> the string value of b has length 3 but buffer().array().length is 1048576.
>
> 4. Restarting the app (which starts the Cassandra daemon) fixes the issue,
> i.e. getOnlyRow returns the correct buffer size.
>
> 5. I'm importing cassandra-all 4.0.1 and the app uses jdk-11.
>
> If you need further info, please do not hesitate to ask.
>
> - Neophytos
>
> PS. I'm experimenting with C* internals for the first time so it's very
> likely I'm doing something wrong.
>
>
>


Re: possible cell buffer size issue

2022-01-27 Thread Bowen Song
I'm not a Java developer, but to the best of my knowledge, the
ByteBuffer.array() method returns the whole backing byte array, not just
the part that's meaningful (i.e. has ever been written to). You may want
to check the difference between bb.capacity() and bb.limit(), and also
check bb.arrayOffset(), because the first element is not always at the
beginning of the byte array.
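
A small, JDK-only sketch of the situation described above: a ByteBuffer that exposes 3 meaningful bytes while array() hands back a much larger backing array. Everything here is an illustration with hypothetical names; in particular, the 1 MiB array and the start index of 209 are assumptions picked to be consistent with the figures reported elsewhere in this thread (limit 212, 3-byte value), not something confirmed about Cassandra's internals.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only; not Cassandra code.
final class BackingArrayDemo {
    /** Builds a 3-byte window over a much larger backing array. */
    static ByteBuffer smallViewOfLargeArray() {
        byte[] pool = new byte[1 << 20];                       // 1048576 bytes, assumed shared array
        byte[] value = "/b1".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(value, 0, pool, 209, value.length);
        // wrap(array, offset, length): position = offset, limit = offset + length,
        // capacity = array.length, arrayOffset = 0
        return ByteBuffer.wrap(pool, 209, value.length);
    }

    public static void main(String[] args) {
        ByteBuffer bb = smallViewOfLargeArray();
        System.out.println("array().length = " + bb.array().length);  // whole backing array
        System.out.println("capacity       = " + bb.capacity());
        System.out.println("position       = " + bb.position());
        System.out.println("limit          = " + bb.limit());
        System.out.println("arrayOffset    = " + bb.arrayOffset());
        System.out.println("remaining      = " + bb.remaining());     // the meaningful bytes
    }
}
```

When the array is accessed directly, the meaningful bytes start at arrayOffset() + position() and run for remaining() bytes; array().length says nothing about the size of the value.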


On 27/01/2022 22:11, Neophytos Demetriou wrote:

Hi,

I'm new to the list but not new to Cassandra. I'm writing an app on 
top of C* and I have come across an issue (huge cell buffer size after 
applying a mutation) that I haven't been able to resolve yet. I would 
appreciate any suggestions/help to resolve this. Here are the details:


1. I have a column family defined as follows:
TableMetadata.Builder metadata =
    TableMetadata
        .builder(KEYSPACE1, CF_STANDARD1)
        .addPartitionKeyColumn("key", Int32Type.instance)
        .addRegularColumn(
            "a",
            MapType.getInstance(AsciiType.instance, SetType.getInstance(UTF8Type.instance, false), false))
        .addRegularColumn("b", UTF8Type.instance);
2. And here's a test that I wrote and that works on the cassandra-4.0
branch:
Row.Builder builder = BTreeRow.unsortedBuilder();
builder.newRow(Clustering.EMPTY);
ColumnMetadata def = metadata.getColumn(new ColumnIdentifier("b", true));
Cell cell = BufferCell.live(def, System.currentTimeMillis(), UTF8Type.instance.decompose("/b1"));
builder.addCell(cell);
PartitionUpdate update = PartitionUpdate.singleRowUpdate(metadata, dk, builder.build());
new Mutation(update).apply();
Row row = Util.getOnlyRow(Util.cmd(cfs, dk).withLimit(1).build());
assertEquals(3, row.getCell(def).buffer().array().length);
3. However, in my app, when I call getOnlyRow after applying the
mutation, the string value of b has length 3 but buffer().array().length
is 1048576.

4. Restarting the app (which starts the Cassandra daemon) fixes the
issue, i.e. getOnlyRow returns the correct buffer size.


5. I'm importing cassandra-all 4.0.1 and the app uses jdk-11.

If you need further info, please do not hesitate to ask.

- Neophytos

PS. I'm experimenting with C* internals for the first time so it's 
very likely I'm doing something wrong.