[jira] [Created] (ARROW-7517) [C++] Builder does not honour dictionary type provided during initialization

2020-01-08 Thread Wamsi Viswanath (Jira)
Wamsi Viswanath created ARROW-7517:
--

 Summary: [C++] Builder does not honour dictionary type provided 
during initialization
 Key: ARROW-7517
 URL: https://issues.apache.org/jira/browse/ARROW-7517
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.15.0
Reporter: Wamsi Viswanath


Below is an example for reproducing the issue:

[https://gist.github.com/wamsiv/d48ec37a9a9b5f4d484de6ff86a3870d]

The builder automatically optimizes the dictionary type depending on the number 
of unique values provided, which results in a schema mismatch.
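
To make the mismatch concrete, here is a minimal standalone sketch that does not use the Arrow API. It assumes, as the report describes, that the builder picks the narrowest signed index width able to address the unique values seen so far, and compares that with the width declared at initialization. Arrow's actual selection rule may differ. `NarrowestIndexBits` and `MatchesDeclared` are hypothetical helpers invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an Arrow function): returns the narrowest signed
// integer width, in bits, that can hold the largest dictionary index needed
// for `num_unique` distinct values.
int NarrowestIndexBits(std::int64_t num_unique) {
  assert(num_unique >= 0);
  const std::int64_t max_index = num_unique > 0 ? num_unique - 1 : 0;
  if (max_index <= INT8_MAX) return 8;
  if (max_index <= INT16_MAX) return 16;
  if (max_index <= INT32_MAX) return 32;
  return 64;
}

// Hypothetical check: does the width chosen from the data agree with the
// width the caller declared up front?
bool MatchesDeclared(int declared_bits, std::int64_t num_unique) {
  return NarrowestIndexBits(num_unique) == declared_bits;
}
```

Under these assumptions, a declared 32-bit index with only three unique values gives `MatchesDeclared(32, 3) == false`, which is the kind of schema mismatch described above.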

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-7512) Dictionary memo missing elements in id_to_dictionary_ map after deserialization

2020-01-07 Thread Wamsi Viswanath (Jira)
Wamsi Viswanath created ARROW-7512:
--

 Summary: Dictionary memo missing elements in id_to_dictionary_ map 
after deserialization
 Key: ARROW-7512
 URL: https://issues.apache.org/jira/browse/ARROW-7512
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.15.0
Reporter: Wamsi Viswanath


The `id_to_dictionary_` map is empty after deserializing a schema with the 
ReadSchema method.

An example for reproduction:

[https://gist.github.com/wamsiv/77dc1db44b5805828172e6c94d61d2d9]

I see that it is probably being missed here: 
https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/metadata_internal.cc#L804

Please let me know whether this behavior is expected and, if so, how the client 
is expected to obtain the dictionary array values.





[jira] [Commented] (ARROW-6470) [C++] Segmentation fault when trying to serialzie empty SerializeRecordBatch

2019-09-05 Thread Wamsi Viswanath (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923808#comment-16923808
 ] 

Wamsi Viswanath commented on ARROW-6470:


Thanks, I just wanted to make sure it was the correct behavior.

> [C++] Segmentation fault when trying to serialzie empty SerializeRecordBatch 
> -
>
> Key: ARROW-6470
> URL: https://issues.apache.org/jira/browse/ARROW-6470
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.13.0
>Reporter: Wamsi Viswanath
>Priority: Major
>
> Below is a simple reproducible example. Please let me know if the behavior is 
> valid:
> ```
> #include <arrow/api.h>
> #include <arrow/ipc/api.h>
> #include <memory>
> #include <stdexcept>
> #include <vector>
>
> int main() {
>   std::shared_ptr<arrow::Schema> schema =
>       arrow::schema({arrow::field("int_", arrow::int32(), false)});
>   std::vector<std::shared_ptr<arrow::Array>> arrays = {};
>
>   std::shared_ptr<arrow::RecordBatch> record_batch =
>       arrow::RecordBatch::Make(schema, arrays[0]->length(), arrays);
>   std::shared_ptr<arrow::Buffer> serialized_buffer;
>   if (!arrow::ipc::SerializeRecordBatch(
>            *record_batch, arrow::default_memory_pool(), &serialized_buffer)
>            .ok()) {
>     throw std::runtime_error("Error: Serializing Records.");
>   }
> }
> ```





[jira] [Updated] (ARROW-6470) Segmentation fault when trying to serialzie empty SerializeRecordBatch

2019-09-05 Thread Wamsi Viswanath (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wamsi Viswanath updated ARROW-6470:
---
Description: 
Below is a simple reproducible example. Please let me know if the behavior is 
valid:

```
#include <arrow/api.h>
#include <arrow/ipc/api.h>
#include <memory>
#include <stdexcept>
#include <vector>

int main() {
  std::shared_ptr<arrow::Schema> schema =
      arrow::schema({arrow::field("int_", arrow::int32(), false)});
  std::vector<std::shared_ptr<arrow::Array>> arrays = {};

  std::shared_ptr<arrow::RecordBatch> record_batch =
      arrow::RecordBatch::Make(schema, arrays[0]->length(), arrays);
  std::shared_ptr<arrow::Buffer> serialized_buffer;
  if (!arrow::ipc::SerializeRecordBatch(
           *record_batch, arrow::default_memory_pool(), &serialized_buffer)
           .ok()) {
    throw std::runtime_error("Error: Serializing Records.");
  }
}
```

  was:
Below is a simple reproducible example. Please let me know if the behavior is 
valid:

```
int main() {
  std::shared_ptr<arrow::Schema> schema =
      arrow::schema({arrow::field("int_", arrow::int32(), false)});
  std::vector<std::shared_ptr<arrow::Array>> arrays = {};

  std::shared_ptr<arrow::RecordBatch> record_batch =
      arrow::RecordBatch::Make(schema, arrays[0]->length(), arrays);
  std::shared_ptr<arrow::Buffer> serialized_buffer;
  if (!arrow::ipc::SerializeRecordBatch(
           *record_batch, arrow::default_memory_pool(), &serialized_buffer)
           .ok()) {
    throw std::runtime_error("Error: Serializing Records.");
  }
}
```


> Segmentation fault when trying to serialzie empty SerializeRecordBatch 
> ---
>
> Key: ARROW-6470
> URL: https://issues.apache.org/jira/browse/ARROW-6470
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.13.0
>Reporter: Wamsi Viswanath
>Priority: Major
>
> Below is a simple reproducible example. Please let me know if the behavior is 
> valid:
> ```
> #include <arrow/api.h>
> #include <arrow/ipc/api.h>
> #include <memory>
> #include <stdexcept>
> #include <vector>
>
> int main() {
>   std::shared_ptr<arrow::Schema> schema =
>       arrow::schema({arrow::field("int_", arrow::int32(), false)});
>   std::vector<std::shared_ptr<arrow::Array>> arrays = {};
>
>   std::shared_ptr<arrow::RecordBatch> record_batch =
>       arrow::RecordBatch::Make(schema, arrays[0]->length(), arrays);
>   std::shared_ptr<arrow::Buffer> serialized_buffer;
>  if (!arrow::ipc::SerializeRecordBatch(
>  

[jira] [Updated] (ARROW-6470) Segmentation fault when trying to serialzie empty SerializeRecordBatch

2019-09-05 Thread Wamsi Viswanath (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wamsi Viswanath updated ARROW-6470:
---
Description: 
Below is a simple reproducible example. Please let me know if the behavior is 
valid:

```
#include <arrow/api.h>
#include <arrow/ipc/api.h>
#include <memory>
#include <stdexcept>
#include <vector>

int main() {
  std::shared_ptr<arrow::Schema> schema =
      arrow::schema({arrow::field("int_", arrow::int32(), false)});
  std::vector<std::shared_ptr<arrow::Array>> arrays = {};

  std::shared_ptr<arrow::RecordBatch> record_batch =
      arrow::RecordBatch::Make(schema, arrays[0]->length(), arrays);
  std::shared_ptr<arrow::Buffer> serialized_buffer;
  if (!arrow::ipc::SerializeRecordBatch(
           *record_batch, arrow::default_memory_pool(), &serialized_buffer)
           .ok()) {
    throw std::runtime_error("Error: Serializing Records.");
  }
}
```

  was:
Below is a simple reproducible example. Please let me know if the behavior is 
valid:

```
#include <arrow/api.h>
#include <arrow/ipc/api.h>
#include <memory>
#include <stdexcept>
#include <vector>

int main() {
  std::shared_ptr<arrow::Schema> schema =
      arrow::schema({arrow::field("int_", arrow::int32(), false)});
  std::vector<std::shared_ptr<arrow::Array>> arrays = {};

  std::shared_ptr<arrow::RecordBatch> record_batch =
      arrow::RecordBatch::Make(schema, arrays[0]->length(), arrays);
  std::shared_ptr<arrow::Buffer> serialized_buffer;
  if (!arrow::ipc::SerializeRecordBatch(
           *record_batch, arrow::default_memory_pool(), &serialized_buffer)
           .ok()) {
    throw std::runtime_error("Error: Serializing Records.");
  }
}
```


> Segmentation fault when trying to serialzie empty SerializeRecordBatch 
> ---
>
> Key: ARROW-6470
> URL: https://issues.apache.org/jira/browse/ARROW-6470
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.13.0
>Reporter: Wamsi Viswanath
>Priority: Major
>
> Below is a simple reproducible example. Please let me know if the behavior is 
> valid:
> ```
> #include <arrow/api.h>
> #include <arrow/ipc/api.h>
> #include <memory>
> #include <stdexcept>
> #include <vector>
>
> int main() {
>   std::shared_ptr<arrow::Schema> schema =
>       arrow::schema({arrow::field("int_", arrow::int32(), false)});
>   std::vector<std::shared_ptr<arrow::Array>> arrays = {};
>
>   std::shared_ptr<arrow::RecordBatch> record_batch =
>       arrow::RecordBatch::Make(schema, arrays[0]->length(), arrays);
>   std::shared_ptr<arrow::Buffer> serialized_buffer;
>   if (!arrow::ipc::SerializeRecordBatch(
>            *record_batch, arrow::default_memory_pool(), &serialized_buffer)
>            .ok()) {
>     throw std::runtime_error("Error: Serializing Records.");
>   }
> }
> ```





[jira] [Updated] (ARROW-6470) Segmentation fault when trying to serialzie empty SerializeRecordBatch

2019-09-05 Thread Wamsi Viswanath (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wamsi Viswanath updated ARROW-6470:
---
Description: 
Below is a simple reproducible example. Please let me know if the behavior is 
valid:

```
int main() {
  std::shared_ptr<arrow::Schema> schema =
      arrow::schema({arrow::field("int_", arrow::int32(), false)});
  std::vector<std::shared_ptr<arrow::Array>> arrays = {};

  std::shared_ptr<arrow::RecordBatch> record_batch =
      arrow::RecordBatch::Make(schema, arrays[0]->length(), arrays);
  std::shared_ptr<arrow::Buffer> serialized_buffer;
  if (!arrow::ipc::SerializeRecordBatch(
           *record_batch, arrow::default_memory_pool(), &serialized_buffer)
           .ok()) {
    throw std::runtime_error("Error: Serializing Records.");
  }
}
```

  was:
Below is a simple reproducible example. Please let me know if the behavior is 
valid:

int main() {
  std::shared_ptr<arrow::Schema> schema =
      arrow::schema({arrow::field("int_", arrow::int32(), false)});
  std::vector<std::shared_ptr<arrow::Array>> arrays = {};


[jira] [Created] (ARROW-6470) Segmentation fault when trying to serialzie empty SerializeRecordBatch

2019-09-05 Thread Wamsi Viswanath (Jira)
Wamsi Viswanath created ARROW-6470:
--

 Summary: Segmentation fault when trying to serialzie empty 
SerializeRecordBatch 
 Key: ARROW-6470
 URL: https://issues.apache.org/jira/browse/ARROW-6470
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.13.0
Reporter: Wamsi Viswanath


Below is a simple reproducible example. Please let me know if the behavior is 
valid:

int main() {
  std::shared_ptr<arrow::Schema> schema =
      arrow::schema({arrow::field("int_", arrow::int32(), false)});
  std::vector<std::shared_ptr<arrow::Array>> arrays = {};

  std::shared_ptr<arrow::RecordBatch> record_batch =
      arrow::RecordBatch::Make(schema, arrays[0]->length(), arrays);
  std::shared_ptr<arrow::Buffer> serialized_buffer;
  if (!arrow::ipc::SerializeRecordBatch(
           *record_batch, arrow::default_memory_pool(), &serialized_buffer)
           .ok()) {
    throw std::runtime_error("Error: Serializing Records.");
  }
}
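
One thing visible in the snippet itself: `arrays` is empty, so `arrays[0]` reads past the end of the vector. That is undefined behavior in C++ and can crash before `SerializeRecordBatch` is ever reached. Below is a minimal standalone sketch of a guard, not a statement about how Arrow handles this internally. `FakeArray` and `SafeNumRows` are hypothetical stand-ins so that the example compiles without Arrow.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for arrow::Array, just enough to expose length().
struct FakeArray {
  std::int64_t len;
  std::int64_t length() const { return len; }
};

// Hypothetical helper: row count for a batch built from `arrays`.
// Returns 0 for an empty column list instead of indexing arrays[0].
std::int64_t SafeNumRows(
    const std::vector<std::shared_ptr<FakeArray>>& arrays) {
  if (arrays.empty() || arrays.front() == nullptr) return 0;
  return arrays.front()->length();
}
```

In the reproduction, the same idea would mean checking `arrays.empty()` before computing the row count passed to `arrow::RecordBatch::Make`.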


