[ 
https://issues.apache.org/jira/browse/ARROW-4231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740299#comment-16740299
 ] 

Tanveer commented on ARROW-4231:
--------------------------------

The data is the same in both cases; the problem is with the 'message data'.
It occurs when a C++ record batch is wrapped in a Plasma object and read in Java.
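
One way to check that the payload is identical and only the leading message bytes differ is to count how many trailing bytes two serialized buffers share. The sketch below is only an illustration: `SuffixCompare` and `commonSuffixLength` are hypothetical names, not part of the Arrow API, and the two arrays are short placeholders rather than the real buffers.

```java
// Hypothetical helper for comparing two serialized buffers; not an Arrow class.
class SuffixCompare {
    // Returns the number of trailing bytes that are identical in both arrays.
    static int commonSuffixLength(byte[] a, byte[] b) {
        int n = 0;
        while (n < a.length && n < b.length
                && a[a.length - 1 - n] == b[b.length - 1 - n]) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Placeholder buffers: the first four bytes stand in for differing
        // message bytes, the rest is {0, 163} encoded as little-endian int32.
        byte[] fromCpp  = {1, 2, 3, 4, 0, 0, 0, 0, (byte) 163, 0, 0, 0};
        byte[] fromJava = {9, 8, 7, 6, 0, 0, 0, 0, (byte) 163, 0, 0, 0};
        System.out.println(commonSuffixLength(fromCpp, fromJava));
    }
}
```

If the shared suffix covers the whole value buffer, the difference is confined to the bytes that precede it.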

C++
////////////////////////////////////////////////////////////////////////////////////////////////////////

arrow::MemoryPool* pool = arrow::default_memory_pool();
arrow::Int32Builder flag_builder(pool);

std::vector<int32_t> flags_data = {0, 163, 0, 83, 0, 16};

for (uint32_t i = 0; i < flags_data.size(); i++) {
  auto flags = flags_data[i];
  flag_builder.Append(flags);
}

  std::shared_ptr<arrow::Array> flag_array;
  flag_builder.Finish(&flag_array);

  std::vector<std::shared_ptr<arrow::Field>> schema_vector = {
    arrow::field("flags", arrow::int32())
  };
  auto schema = std::make_shared<arrow::Schema>(schema_vector);
  std::shared_ptr<arrow::RecordBatch> batch =
      arrow::RecordBatch::Make(schema, 1, {flag_array});

  std::shared_ptr<arrow::ResizableBuffer> resizable_buffer;
  arrow::AllocateResizableBuffer(arrow::default_memory_pool(), 0,
                                 &resizable_buffer);

  auto buffer = std::dynamic_pointer_cast<arrow::Buffer>(resizable_buffer);
  arrow::ipc::SerializeRecordBatch(*batch, arrow::default_memory_pool(),
                                   &buffer);


Java
////////////////////////////////////////////////////////////////////////////////////////////////////////

int[] values = new int[]{0, 163, 0, 83, 0, 16};

BufferAllocator alloc1 = new RootAllocator(Long.MAX_VALUE);
ArrowBuf valuesb = intBuf(values);

ArrowRecordBatch batch = new ArrowRecordBatch(6,
    Lists.newArrayList(new ArrowFieldNode(6, 0)), Lists.newArrayList(valuesb));

ByteArrayOutputStream out = new ByteArrayOutputStream();
MessageSerializer.serialize(new WriteChannel(Channels.newChannel(out)), batch);


////////////////////////////////////////////////////////////////////////////////////////////////////////

> C++ & Java recordbatch serialization protocol does not match
> ------------------------------------------------------------
>
>                 Key: ARROW-4231
>                 URL: https://issues.apache.org/jira/browse/ARROW-4231
>             Project: Apache Arrow
>          Issue Type: Bug
>            Reporter: Tanveer
>            Priority: Major
>
> I have a simple array of int32() {0, 163, 0, 83, 0, 16}
> In C++, the output ArrowBuffer of the serialized record batch is:
> -116 000200000000120220 6050801201200003302400024000000000   
> 10024012040801000060000160001000000000002000000000000000000000000000 
> 24000000000001000600000000000000000000000-93000000083000000016000
> In Java, with the same array, the output of the serialized record batch in
> WriteChannel out is:
>  124 000200000000120220 140210160401200024000000000301600003 
> 1002401208040100002000040000600000000000100000000000                 
> 24000000000001000600000000000000000000000-93000000083000000016000
>  
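
The two dumps above already differ in their first value (-116 versus 124). Assuming the dumps list signed byte values, and assuming each buffer starts with the 4-byte little-endian metadata length used by the Arrow IPC message framing, those leading bytes can be decoded as in the sketch below. `IpcPrefix` and `readInt32LE` are hypothetical names for illustration, and only the first four bytes of each dump are used because the remainder cannot be split into bytes unambiguously.

```java
// Hypothetical helper for inspecting the start of a serialized message; not an Arrow class.
class IpcPrefix {
    // Reads a little-endian 32-bit integer starting at the given offset.
    static int readInt32LE(byte[] buf, int offset) {
        return (buf[offset] & 0xFF)
             | ((buf[offset + 1] & 0xFF) << 8)
             | ((buf[offset + 2] & 0xFF) << 16)
             | ((buf[offset + 3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        // Assumed first four bytes of each dump (signed byte values).
        byte[] cppStart  = {(byte) -116, 0, 0, 0};
        byte[] javaStart = {124, 0, 0, 0};
        System.out.println("C++ prefix:  " + readInt32LE(cppStart, 0));  // 140
        System.out.println("Java prefix: " + readInt32LE(javaStart, 0)); // 124
    }
}
```

Under those assumptions the two writers emit metadata blocks of different sizes for the same values, which is consistent with the trailing value bytes being identical in both dumps.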



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
