[
https://issues.apache.org/jira/browse/ARROW-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376441#comment-16376441
]
Paul Taylor commented on ARROW-2202:
------------------------------------
[~bhulette] since {{RecordBatch}} extends {{StructVector}}, and
{{StructVector}} implements {{toJSON()}}, we should be able to implement
{{toJSON()}} on the Table by calling {{toJSON()}} on each inner
{{RecordBatch}}. This would yield each row as a compact Array of values, but we
could also apply the {{MapView}} to each RecordBatch if we wanted rows as JS
Objects of key/value pairs instead. Alternatively, we could refactor the
internal
[{{tableRowsToString()}}|https://github.com/apache/arrow/blob/master/js/src/table.ts#L317]
method to yield compact rows.
Since serializing table rows to JS Arrays or Objects can easily exceed the
memory limit of a single Node.js process, it's probably worth exposing a row
generator function. I ran into this issue piping the {{toString()}} result to
the console, so we can follow the same pattern here.
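To illustrate the row generator idea, here is a minimal sketch. It assumes a simplified stand-in for a Table (a plain array of batches with column-major values); the {{rows}} function and the batch shape are hypothetical and not part of the Arrow JS API.

```javascript
// Hypothetical stand-in for a Table: an array of batches, each holding
// column-major values. This is NOT the actual Arrow JS API.
function* rows(batches) {
  for (const batch of batches) {
    for (let i = 0; i < batch.length; i++) {
      // Build one compact row at a time instead of materializing all rows.
      yield batch.columns.map((col) => col[i]);
    }
  }
}

const batches = [
  { length: 2, columns: [["keyA", "keyB"], [10, 10]] },
  { length: 1, columns: [["keyC"], [7]] },
];

// The consumer pulls rows lazily, so only one row is held in memory at a time.
for (const row of rows(batches)) {
  console.log(JSON.stringify(row));
}
```

Because the generator is lazy, a caller can stream rows to the console or a file without ever holding the full serialized table.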
It also might be valuable to name it something else, and reserve {{toJSON()}}
for generating the Arrow JSON format. The rationale here is that {{toJSON()}} is
automatically invoked by {{JSON.stringify()}}, which is most commonly used for
serialization and deserialization, making this possible:
{code:javascript}
const newTable = Table.from(JSON.parse(JSON.stringify(oldTable)));
{code}
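As a rough illustration of why reserving {{toJSON()}} matters, the sketch below shows {{JSON.stringify()}} picking up a {{toJSON()}} method and a matching {{from()}} rebuilding the object. {{TinyTable}} and its JSON layout are hypothetical, chosen only for the example; they are not the Arrow JS {{Table}} or the Arrow JSON format.

```javascript
// Hypothetical class for illustration only; not the Arrow JS Table.
class TinyTable {
  constructor(columns) {
    this.columns = columns;
  }

  // JSON.stringify() calls toJSON() automatically when it is defined,
  // and serializes the returned value instead of the object itself.
  toJSON() {
    return { columns: this.columns };
  }

  // Rebuild a table from the structure produced by toJSON().
  static from(json) {
    return new TinyTable(json.columns);
  }
}

const oldTable = new TinyTable({ value: ["keyA", "keyB"], count: [10, 10] });
const newTable = TinyTable.from(JSON.parse(JSON.stringify(oldTable)));
```

If {{toJSON()}} returned display-oriented rows instead, this round trip would depend on {{from()}} accepting that row shape.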
> [JS] Add DataFrame.toJSON
> -------------------------
>
> Key: ARROW-2202
> URL: https://issues.apache.org/jira/browse/ARROW-2202
> Project: Apache Arrow
> Issue Type: Improvement
> Components: JavaScript
> Reporter: Brian Hulette
> Priority: Major
>
> Currently, {{CountByResult}} has its own [{{toJSON}}
> method|https://github.com/apache/arrow/blob/master/js/src/table.ts#L282], but
> there should be a more general one for every {{DataFrame}}.
> {{CountByResult.toJSON}} returns:
> {code:json}
> {
> "keyA": 10,
> "keyB": 10,
> ...
> }{code}
> A more general {{toJSON}} could just return a list of objects with an entry
> for each column. For the above {{CountByResult}}, the output would look like:
> {code:json}
> [
> {"value": "keyA", "count": 10},
> {"value": "keyB", "count": 10},
> ...
> ]{code}