[
https://issues.apache.org/jira/browse/ARROW-4231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740299#comment-16740299
]
Tanveer commented on ARROW-4231:
--------------------------------
The data is the same in both cases; the problem is with the "message data".
It occurs when a C++ RecordBatch is wrapped in a Plasma object and read in Java.
C++
////////////////////////////////////////////////////////////////////////////////////////////////////////
arrow::MemoryPool* pool = arrow::default_memory_pool();
arrow::Int32Builder flag_builder(pool);
std::vector<int32_t> flags_data = {0, 163, 0, 83, 0, 16};
for (uint32_t i = 0; i < flags_data.size(); i++) {
  auto flags = flags_data[i];
  flag_builder.Append(flags);
}
std::shared_ptr<arrow::Array> flag_array;
flag_builder.Finish(&flag_array);
std::vector<std::shared_ptr<arrow::Field>> schema_vector = {
  arrow::field("flags", arrow::int32())
};
auto schema = std::make_shared<arrow::Schema>(schema_vector);
// The row count must match the column length (6 values here).
std::shared_ptr<arrow::RecordBatch> batch = arrow::RecordBatch::Make(schema,
    flag_array->length(), {flag_array});
std::shared_ptr<arrow::ResizableBuffer> resizable_buffer;
arrow::AllocateResizableBuffer(arrow::default_memory_pool(), 0,
    &resizable_buffer);
auto buffer = std::dynamic_pointer_cast<arrow::Buffer>(resizable_buffer);
arrow::ipc::SerializeRecordBatch(*batch, arrow::default_memory_pool(),
    &buffer);
Java
////////////////////////////////////////////////////////////////////////////////////////////////////////
int[] values = new int[]{0, 163, 0, 83, 0, 16};
BufferAllocator alloc1 = new RootAllocator(Long.MAX_VALUE);
ArrowBuf valuesb = intBuf(values);
ArrowRecordBatch batch = new ArrowRecordBatch(6, Lists.newArrayList(new
ArrowFieldNode(6, 0)), Lists.newArrayList(valuesb));
ByteArrayOutputStream out = new ByteArrayOutputStream();
MessageSerializer.serialize(new WriteChannel(Channels.newChannel(out)), batch);
////////////////////////////////////////////////////////////////////////////////////////////////////////
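To pin down where the two serialized outputs diverge, a byte-level comparison may help. The following is only a minimal sketch and not part of Arrow: `SerializedDiff`, `toUnsigned` and `firstMismatch` are hypothetical helper names, the sample bytes are placeholders rather than the real outputs, and it assumes both outputs are already available as `byte[]` in Java (for example, the C++ one read back from Plasma).

```java
import java.util.StringJoiner;

public class SerializedDiff {
    /** Renders bytes as space-separated unsigned values (0..255). */
    public static String toUnsigned(byte[] bytes) {
        StringJoiner joiner = new StringJoiner(" ");
        for (byte b : bytes) {
            joiner.add(Integer.toString(b & 0xFF));
        }
        return joiner.toString();
    }

    /**
     * Returns the index of the first differing byte, or -1 if the arrays
     * are identical. If one array is a strict prefix of the other, the
     * length of the shorter array is returned.
     */
    public static int firstMismatch(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : n;
    }

    public static void main(String[] args) {
        // Placeholder bytes for illustration only, not the real outputs.
        byte[] fromCpp = {1, 2, 3, (byte) 163};
        byte[] fromJava = {1, 2, 4, (byte) 163};
        System.out.println(toUnsigned(fromCpp));
        System.out.println(firstMismatch(fromCpp, fromJava));
    }
}
```

Printing the bytes with separators and as unsigned values also avoids the ambiguity of a run-together dump, where 163 shows up as -93.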
> C++ & Java recordbatch serialization protocol does not match
> ------------------------------------------------------------
>
> Key: ARROW-4231
> URL: https://issues.apache.org/jira/browse/ARROW-4231
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: Tanveer
> Priority: Major
>
> I have a simple array of int32(): {0, 163, 0, 83, 0, 16}
> In C++ the output ArrowBuffer of serialized record batch is:
> -116 000200000000120220 6050801201200003302400024000000000
> 10024012040801000060000160001000000000002000000000000000000000000000
> 24000000000001000600000000000000000000000-93000000083000000016000
> In Java, with the same array the output of serialized record batch in
> WriteChannel out is:
> 124 000200000000120220 140210160401200024000000000301600003
> 1002401208040100002000040000600000000000100000000000
> 24000000000001000600000000000000000000000-93000000083000000016000
>
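One way to read the two dumps above: in the Arrow IPC encapsulated message layout, a serialized message begins with a 4-byte little-endian metadata length, followed by the Flatbuffers metadata and then the body. The sketch below decodes that prefix. It assumes the dumps list signed byte values and that their leading bytes are -116, 0, 0, 0 (C++) and 124, 0, 0, 0 (Java); because the dumps have no separators, that split is an assumption. `MetadataLength` and `metadataLength` are hypothetical names, not Arrow APIs.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class MetadataLength {
    /** Reads the first four bytes as a little-endian 32-bit integer. */
    public static int metadataLength(byte[] serialized) {
        return ByteBuffer.wrap(serialized, 0, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    public static void main(String[] args) {
        // Assumed leading bytes of each dump (signed byte values).
        byte[] cppPrefix = {-116, 0, 0, 0};
        byte[] javaPrefix = {124, 0, 0, 0};
        System.out.println("C++  metadata length: " + metadataLength(cppPrefix));
        System.out.println("Java metadata length: " + metadataLength(javaPrefix));
    }
}
```

Under those assumptions the two prefixes decode to 140 and 124, which would mean the metadata differs in size while the trailing body bytes (the last dump line in each case) are identical.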