[ 
https://issues.apache.org/jira/browse/ARROW-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701921#comment-16701921
 ] 

Antoine Pitrou commented on ARROW-3303:
---------------------------------------

Do you have an idea what the JSON could look like?

Let's try to sketch examples. Simple int32 array:
{code:json}
{
"type": "int32",
"values": [4,5,-1,0]
}
{code}

With nulls:
{code:json}
{
"type": "int32",
"values": [0,null,-111,234]
}
{code}

A second-granularity timestamp array:
{code:json}
{
"type": "timestamp[s]",
"values": [1543409509]
}
{code}

A list(float64) array with nulls:
{code:json}
{
"type": "list(float64)",
"values": [null, [1.5, 2.5, 3.0], [null, 34.5]]
}
{code}
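
To make the nested case concrete, here is a rough sketch of the columnar layout that the list(float64) example above would have to produce: offsets, a validity flag per list, and a flattened child array with its own validity flags. The names {{SimpleListOfDouble}}, {{ExampleLayout}} and {{ListLength}} are hypothetical and only mirror the shape of an Arrow list array; they are not Arrow API. The null list is given zero length here, which is one valid choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified mirror of a list(float64) array (not Arrow API).
struct SimpleListOfDouble {
  std::vector<int32_t> offsets;      // length = number of lists + 1
  std::vector<bool> list_valid;      // false where the list itself is null
  std::vector<double> child_values;  // flattened elements (0.0 placeholder where null)
  std::vector<bool> child_valid;     // false where an element is null
};

// Layout for the JSON values [null, [1.5, 2.5, 3.0], [null, 34.5]],
// written out by hand. The null list occupies no child slots.
SimpleListOfDouble ExampleLayout() {
  SimpleListOfDouble a;
  a.offsets = {0, 0, 3, 5};
  a.list_valid = {false, true, true};
  a.child_values = {1.5, 2.5, 3.0, 0.0, 34.5};
  a.child_valid = {true, true, true, false, true};
  return a;
}

// Number of child slots covered by list i (0 for the null list in this sketch).
int32_t ListLength(const SimpleListOfDouble& a, std::size_t i) {
  return a.offsets[i + 1] - a.offsets[i];
}
```

Writing these four vectors by hand for every test case is the boilerplate that a JSON representation would remove.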

With the type embedded in the JSON, the API could look like:
{code:c++}
Status ArrayFromJSON(const std::string& json_string, std::shared_ptr<Array>* out);
{code}

Another possibility would be to pass the type programmatically:
{code:c++}
Status ArrayFromJSON(const std::shared_ptr<DataType>& type,
                     const std::string& json_string,
                     std::shared_ptr<Array>* out);
{code}

In this case the JSON string would be much simpler, e.g.:
{code:json}
[0,null,-111,234]
{code}

So a test array could be written in one line, or nearly so:
{code:c++}
ASSERT_OK(ArrayFromJSON(int64(), "[0,null,-111,234]", &array1));
ASSERT_OK(ArrayFromJSON(list(int64()), "[null, [1,2,3], [4, null, 6]]", &array2));
{code}
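
As a rough illustration of the flat case only, the sketch below parses a string such as {{[0,null,-111,234]}} into values plus validity flags using nothing but the standard library. Everything in it is an assumption made for illustration: {{SimpleInt64Array}} and {{ParseInt64ArrayJSON}} are hypothetical names, errors are reported with exceptions instead of {{Status}}, trailing characters after the closing bracket are not checked, and a real implementation would presumably sit on top of a proper JSON parser and Arrow's array builders.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an int64 array: one value and one validity flag per slot.
struct SimpleInt64Array {
  std::vector<int64_t> values;  // 0 placeholder where the slot is null
  std::vector<bool> valid;      // false where the JSON had null
};

// Hypothetical helper (not Arrow API): parses a flat JSON array of
// integers and nulls, e.g. "[0,null,-111,234]".
SimpleInt64Array ParseInt64ArrayJSON(const std::string& json) {
  SimpleInt64Array out;
  std::size_t pos = 0;
  auto skip_ws = [&] {
    while (pos < json.size() &&
           std::isspace(static_cast<unsigned char>(json[pos]))) {
      ++pos;
    }
  };
  skip_ws();
  if (pos >= json.size() || json[pos] != '[') {
    throw std::invalid_argument("expected '['");
  }
  ++pos;
  skip_ws();
  if (pos < json.size() && json[pos] == ']') return out;  // empty array
  while (true) {
    skip_ws();
    if (json.compare(pos, 4, "null") == 0) {
      out.values.push_back(0);
      out.valid.push_back(false);
      pos += 4;
    } else {
      std::size_t consumed = 0;
      // std::stoll throws std::invalid_argument if no number starts here.
      long long v = std::stoll(json.substr(pos), &consumed);
      out.values.push_back(static_cast<int64_t>(v));
      out.valid.push_back(true);
      pos += consumed;
    }
    skip_ws();
    if (pos < json.size() && json[pos] == ',') {
      ++pos;
      continue;
    }
    if (pos < json.size() && json[pos] == ']') break;
    throw std::invalid_argument("expected ',' or ']'");
  }
  return out;
}
```

A test could then compare the parsed result with an expected array in one statement instead of asserting on each slot by hand.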

Actually both APIs could be useful, depending on the situation...


> [C++] Enable example arrays to be written with a simplified JSON 
> representation
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-3303
>                 URL: https://issues.apache.org/jira/browse/ARROW-3303
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>             Fix For: 0.12.0
>
>
> In addition to making it easier to generate random data as described in 
> ARROW-2329, I think it would be useful to reduce some of the boilerplate 
> associated with writing down explicit test cases. The benefits of this will 
> be especially pronounced when writing nested arrays. 
> Example code that could be improved this way:
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/array-test.cc#L3271
> Rather than having a ton of hand-written assertions, we could compare with 
> the expected dataset. Of course, the JSON parser itself has to be tested 
> independently, but I think we can write enough tests for it to have 
> confidence in tests that are written with it.



