[ 
https://issues.apache.org/jira/browse/ARROW-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376441#comment-16376441
 ] 

Paul Taylor commented on ARROW-2202:
------------------------------------

[~bhulette] since {{RecordBatch}} extends {{StructVector}}, and 
{{StructVectors}} implement {{toJSON()}}, we should be able to implement 
{{toJSON()}} on the Table by calling {{toJSON()}} on each inner 
{{RecordBatch}}. This would yield rows as compact Arrays of each value, but we 
could also apply the {{MapView}} to each RecordBatch if we wanted rows as JS 
Objects of key/value pairs instead. Alternatively, we could refactor the 
internal [{{tableRowsToString()}}|https://github.com/apache/arrow/blob/master/js/src/table.ts#L317] method to yield compact rows.
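
As a rough illustration of the first option, here is a minimal sketch. {{MiniBatch}}, {{MiniTable}}, and {{toObjects()}} are hypothetical stand-ins invented for this example, not the real Arrow JS classes or {{MapView}}; the only point is that the table-level {{toJSON()}} concatenates what each batch returns.

```typescript
// Hypothetical stand-ins for RecordBatch and Table; NOT the real Arrow JS API.
type Row = unknown[];

class MiniBatch {
  constructor(readonly names: string[], readonly columns: unknown[][]) {}

  // Number of rows, taken from the first column (0 if there are no columns).
  get length(): number {
    return this.columns.length > 0 ? this.columns[0].length : 0;
  }

  // Compact rows: one Array of values per row, in column order.
  toJSON(): Row[] {
    const rows: Row[] = [];
    for (let i = 0; i < this.length; i++) {
      rows.push(this.columns.map((col) => col[i]));
    }
    return rows;
  }
}

class MiniTable {
  constructor(readonly batches: MiniBatch[]) {}

  // Table-level toJSON(): concatenate the rows of every inner batch.
  toJSON(): Row[] {
    const rows: Row[] = [];
    for (const batch of this.batches) {
      for (const row of batch.toJSON()) rows.push(row);
    }
    return rows;
  }

  // Alternative shape: rows as key/value objects (assumed helper name).
  toObjects(): Record<string, unknown>[] {
    const out: Record<string, unknown>[] = [];
    for (const batch of this.batches) {
      for (const row of batch.toJSON()) {
        const obj: Record<string, unknown> = {};
        batch.names.forEach((name, j) => { obj[name] = row[j]; });
        out.push(obj);
      }
    }
    return out;
  }
}
```

Because {{JSON.stringify()}} calls {{toJSON()}} on its argument, stringifying a {{MiniTable}} in this sketch produces the compact array-of-arrays form directly.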

Since serializing every table row to JS Arrays or Objects at once can easily 
exceed the memory limit of a single Node.js process, it's probably worth 
exposing a row generator function. I ran into this problem when piping the 
{{toString()}} result to the console, so we can follow the same pattern here.
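
A row generator could look roughly like the sketch below. {{ColumnBatch}} and {{rowsOf()}} are hypothetical names used only for illustration; the idea is that rows are produced one at a time, so the caller never holds the whole serialized table in memory.

```typescript
// Hypothetical minimal batch shape; NOT the real Arrow RecordBatch.
interface ColumnBatch {
  columns: unknown[][]; // one array of values per column, all the same length
}

// Lazily yield one compact row (Array of values) at a time across all batches.
function* rowsOf(batches: Iterable<ColumnBatch>): Generator<unknown[]> {
  for (const batch of batches) {
    const length = batch.columns.length > 0 ? batch.columns[0].length : 0;
    for (let i = 0; i < length; i++) {
      yield batch.columns.map((col) => col[i]);
    }
  }
}
```

A consumer can then stream the rows, for example writing {{JSON.stringify(row)}} for each row in a {{for...of}} loop, instead of building one large array first.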

It also might be valuable to name it something else, and reserve {{toJSON()}} 
for generating the Arrow JSON format. The rationale here is that {{toJSON()}} is 
automatically invoked by {{JSON.stringify()}}, which is most commonly used for 
serialization and deserialization, making this possible:
{code:javascript}
const newTable = Table.from(JSON.parse(JSON.stringify(oldTable)));
{code}
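
To make the mechanism concrete, here is a small self-contained sketch of that round trip. {{TinyTable}} and its payload shape are made up for this example; the payload is not the actual Arrow JSON format, and {{TinyTable.from()}} is not the real {{Table.from()}}.

```typescript
// Hypothetical payload shape, invented for this sketch (not the Arrow JSON format).
interface TinyPayload {
  names: string[];
  columns: unknown[][];
}

class TinyTable {
  constructor(readonly names: string[], readonly columns: unknown[][]) {}

  // JSON.stringify() calls this automatically and serializes its return value.
  toJSON(): TinyPayload {
    return { names: this.names, columns: this.columns };
  }

  // Rebuild a table from a parsed payload.
  static from(payload: TinyPayload): TinyTable {
    return new TinyTable(payload.names, payload.columns);
  }
}

const oldTable = new TinyTable(["value", "count"], [["keyA", "keyB"], [10, 10]]);
const newTable = TinyTable.from(JSON.parse(JSON.stringify(oldTable)));
```

If {{toJSON()}} returned plain rows instead, the stringified output would carry no schema, and a round trip like this would have to guess column names and types.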

> [JS] Add DataFrame.toJSON
> -------------------------
>
>                 Key: ARROW-2202
>                 URL: https://issues.apache.org/jira/browse/ARROW-2202
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: JavaScript
>            Reporter: Brian Hulette
>            Priority: Major
>
> Currently, {{CountByResult}} has its own [{{toJSON}} 
> method|https://github.com/apache/arrow/blob/master/js/src/table.ts#L282], but 
> there should be a more general one for every {{DataFrame}}.
> {{CountByResult.toJSON}} returns:
> {code:json}
> {
>   "keyA": 10,
>   "keyB": 10,
>   ...
> }{code}
> A more general {{toJSON}} could just return a list of objects with an entry 
> for each column. For the above {{CountByResult}}, the output would look like:
> {code:json}
> [
>   {value: "keyA", count: 10},
>   {value: "keyB", count: 10},
>   ...
> ]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
