[ https://issues.apache.org/jira/browse/ARROW-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701921#comment-16701921 ]
Antoine Pitrou commented on ARROW-3303:
---------------------------------------
Do you have an idea what the JSON could look like?
Let's try to sketch examples. Simple int32 array:
{code:json}
{
"type": "int32",
"values": [4,5,-1,0]
}
{code}
With nulls:
{code:json}
{
"type": "int32",
"values": [0,null,-111,234]
}
{code}
A second-granularity timestamp array:
{code:json}
{
"type": "timestamp[s]",
"values": [1543409509]
}
{code}
A list(float64) array with nulls:
{code:json}
{
"type": "list(float64)",
"values": [null, [1.5, 2.5, 3.0], [null, 34.5]]
}
{code}
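To make the nested case concrete, here is a rough sketch of what that one JSON line stands for once flattened into offsets, validity flags, and child values. `ListFloat64Layout`, `AppendNullList`, `AppendList`, and `BuildExample` are hypothetical names made up for illustration, not Arrow classes or builders, and plain `std::vector`s stand in for Arrow buffers:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical flattened form of a list(float64) array (not an Arrow class).
struct ListFloat64Layout {
  std::vector<int32_t> offsets{0};   // offsets[i]..offsets[i+1] delimit list i
  std::vector<bool> list_valid;      // false marks a null list
  std::vector<double> child_values;  // all list elements, concatenated
  std::vector<bool> child_valid;     // false marks a null element
};

// One element of an inner list: (value, is_valid).
using Element = std::pair<double, bool>;

// Appends a null list: no child elements, so the offset is repeated.
void AppendNullList(ListFloat64Layout* layout) {
  assert(layout != nullptr);
  layout->list_valid.push_back(false);
  layout->offsets.push_back(layout->offsets.back());
}

// Appends a non-null list; null elements are stored as 0.0 with a false flag.
void AppendList(ListFloat64Layout* layout, const std::vector<Element>& elements) {
  assert(layout != nullptr);
  for (const Element& e : elements) {
    layout->child_values.push_back(e.second ? e.first : 0.0);
    layout->child_valid.push_back(e.second);
  }
  layout->list_valid.push_back(true);
  layout->offsets.push_back(static_cast<int32_t>(layout->child_values.size()));
}

// Builds the layout for [null, [1.5, 2.5, 3.0], [null, 34.5]].
ListFloat64Layout BuildExample() {
  ListFloat64Layout layout;
  AppendNullList(&layout);
  AppendList(&layout, {{1.5, true}, {2.5, true}, {3.0, true}});
  AppendList(&layout, {{0.0, false}, {34.5, true}});
  return layout;
}
```

Writing this out by hand for every test case is the boilerplate a JSON form would let us skip.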
So the API would look like:
{code:c++}
Status ArrayFromJSON(const std::string& json_string, std::shared_ptr<Array>* out);
{code}
Another possibility would be to pass the type programmatically:
{code:c++}
Status ArrayFromJSON(const std::shared_ptr<DataType>& type, const std::string& json_string,
                     std::shared_ptr<Array>* out);
{code}
In this case the JSON string would be much simpler, e.g.:
{code:json}
[0,null,-111,234]
{code}
So a test could build an array in a one-liner, or nearly so:
{code:c++}
ASSERT_OK(ArrayFromJSON(int64(), "[0,null,-111,234]", &array1));
ASSERT_OK(ArrayFromJSON(list(int64()), "[null, [1,2,3], [4, null, 6]]", &array2));
{code}
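For a feel of how little parsing the typed variant needs in the flat integer case, here is a minimal sketch using only the standard library. `Int64Column` and `ParseInt64Json` are hypothetical names, it handles only a flat array of integers and `null`, and it is not a general JSON parser; a real implementation would presumably sit on top of a JSON library and Arrow's builders:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an int64 array: values plus validity flags.
struct Int64Column {
  std::vector<int64_t> values;  // 0 is stored in null slots
  std::vector<bool> valid;      // false marks a null slot
};

// Parses a flat JSON array such as "[0,null,-111,234]".
// Returns false on malformed input and leaves *out untouched in that case.
bool ParseInt64Json(const std::string& json, Int64Column* out) {
  assert(out != nullptr);
  Int64Column result;
  std::size_t pos = 0;
  auto skip_ws = [&] {
    while (pos < json.size() &&
           std::isspace(static_cast<unsigned char>(json[pos]))) {
      ++pos;
    }
  };
  skip_ws();
  if (pos >= json.size() || json[pos] != '[') return false;
  ++pos;
  skip_ws();
  if (pos < json.size() && json[pos] == ']') {
    ++pos;  // empty array
  } else {
    while (true) {
      skip_ws();
      if (json.compare(pos, 4, "null") == 0) {
        result.values.push_back(0);
        result.valid.push_back(false);
        pos += 4;
      } else {
        const char* begin = json.c_str() + pos;
        char* end = nullptr;
        const long long v = std::strtoll(begin, &end, 10);
        if (end == begin) return false;  // no digits consumed
        result.values.push_back(static_cast<int64_t>(v));
        result.valid.push_back(true);
        pos += static_cast<std::size_t>(end - begin);
      }
      skip_ws();
      if (pos >= json.size()) return false;  // missing closing bracket
      if (json[pos] == ',') { ++pos; continue; }
      if (json[pos] == ']') { ++pos; break; }
      return false;  // unexpected character after a value
    }
  }
  skip_ws();
  if (pos != json.size()) return false;  // trailing garbage
  *out = std::move(result);
  return true;
}
```

Calling `ParseInt64Json("[0,null,-111,234]", &col)` fills four slots, with the second one flagged as null. The nested case would need recursion on the type, which is where passing a `DataType` pays off.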
Actually both APIs could be useful, depending on the situation...
> [C++] Enable example arrays to be written with a simplified JSON
> representation
> -------------------------------------------------------------------------------
>
> Key: ARROW-3303
> URL: https://issues.apache.org/jira/browse/ARROW-3303
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Wes McKinney
> Priority: Major
> Fix For: 0.12.0
>
>
> In addition to making it easier to generate random data as described in
> ARROW-2329, I think it would be useful to reduce some of the boilerplate
> associated with writing down explicit test cases. The benefits of this will
> be especially pronounced when writing nested arrays.
> Example code that could be improved this way:
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/array-test.cc#L3271
> Rather than having a ton of hand-written assertions, we could compare against
> the expected dataset. Of course, the JSON parser itself has to be tested
> without relying on this helper, but I think we can write enough tests for it
> to have confidence in tests that are written with it.