Is it actually a good idea to send complex nested data structures over Arrow?  
Don’t you lose a lot of its benefits?

Just asking why, as I'm genuinely curious: I have some data that is ostensibly 
a time series (trade positions over time, where each snapshot is a hashmap 
whose cardinality varies, so it can’t feasibly be split into a stream per key).
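
To make the shape concrete, here is a minimal sketch of how such snapshots could be flattened into an offsets/keys/values layout, which is how Arrow lays out list-like and map columns. It is plain Python, not a real Arrow API; the instrument names, the values, and the helper names `to_map_column` and `snapshot_at` are made up for illustration.

```python
from datetime import datetime

# Hypothetical snapshots: instrument -> position size.
# The set of keys differs from one snapshot to the next.
snapshots = [
    (datetime(2022, 5, 11, 16, 0), {"AAPL": 100.0, "MSFT": -50.0}),
    (datetime(2022, 5, 11, 16, 1), {"AAPL": 120.0}),
    (datetime(2022, 5, 11, 16, 2), {"AAPL": 120.0, "MSFT": -50.0, "TSLA": 10.0}),
]


def to_map_column(rows):
    """Flatten (timestamp, dict) rows into one timestamp column plus
    offsets / keys / values, i.e. a list of (key, value) entries per row."""
    timestamps, offsets, keys, values = [], [0], [], []
    for ts, positions in rows:
        timestamps.append(ts)
        for key, value in positions.items():
            keys.append(key)
            values.append(value)
        # Row i owns the entries in keys/values[offsets[i]:offsets[i + 1]].
        offsets.append(len(keys))
    return timestamps, offsets, keys, values


def snapshot_at(i, offsets, keys, values):
    """Rebuild the i-th snapshot as a dict from the flat columns."""
    start, end = offsets[i], offsets[i + 1]
    return dict(zip(keys[start:end], values[start:end]))


timestamps, offsets, keys, values = to_map_column(snapshots)
```

In this layout all position sizes sit in one contiguous `values` column regardless of how many keys each snapshot has; looking up a single key across time means filtering on the `keys` column rather than reading a dedicated stream.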

> On 11 May 2022, at 16:27, Gavin Ray <[email protected]> wrote:
> 
> 
> > For the Request, we have a Protobuf message that consists of two strings: 
> > the GraphQL query and an optional JSON string for variable definitions. We 
> > marshal the protobuf message to bytes which are used as the “ticket” for 
> > the `DoGet` request through Flight.
> 
> Ah okay, so for the request you would just follow the standard "/graphql" 
> query object, with/without "operationName"
> 
> >  Because Arrow already contains complex types like nested structs / lists / 
> > etc. it’s not too difficult to construct an arrow Schema from the expected 
> > GraphQL response schema and just return a stream of record batches.
> 
> Nice to know this is doable, it seemed like it might be overly complicated to 
> write the transform
> 
> > Since pretty much every existing GraphQL engine outputs JSON right now, 
> > we’ve essentially built our own execution engine at this point by utilizing 
> > the planner from https://pkg.go.dev/github.com/jensneuse/graphql-go-tools 
> > and a custom built execution layer to execute the generated plan. 
> 
> This library is really neat, thanks for posting.
> Seems to have wrapper datafetchers too for REST/GQL/static datasources, which 
> is nice.
> 
> I'd be using "graphql-java", where a resolver/datafetcher can return 
> arbitrary types so
> I don't think the JSON bit would be a hangup at least -- could directly 
> return record batches from query execution.
> 
> 
>> On Wed, May 11, 2022 at 10:55 AM Matthew Topol <[email protected]> wrote:
>> So I’m actually doing this currently in production for a service, as I spoke 
>> about in a talk at the Subsurface conference.
>> 
>>  
>> 
>> For the Request, we have a Protobuf message that consists of two strings: 
>> the GraphQL query and an optional JSON string for variable definitions. We 
>> marshal the protobuf message to bytes which are used as the “ticket” for the 
>> `DoGet` request through Flight.
>> 
>>  
>> 
>> Since pretty much every existing GraphQL engine outputs JSON right now, 
>> we’ve essentially built our own execution engine at this point by utilizing 
>> the planner from https://pkg.go.dev/github.com/jensneuse/graphql-go-tools 
>> and a custom built execution layer to execute the generated plan. Because 
>> Arrow already contains complex types like nested structs / lists / etc. it’s 
>> not too difficult to construct an arrow Schema from the expected GraphQL 
>> response schema and just return a stream of record batches.
>> 
>>  
>> 
>> --Matt
>> 
>>  
>> 
>> From: Gavin Ray <[email protected]>
>> Sent: Wednesday, May 11, 2022 10:30 AM
>> To: [email protected]
>> Subject: GraphQL over Arrow (+ Flight)?
>> 
>>  
>> 
>> If anyone is familiar with both GraphQL and Arrow, I'm curious how exactly 
>> using these two together might look.
>> 
>> GraphQL is transport-agnostic, so you can theoretically use it over 
>> anything, a good case study being Dan Luu's article here:
>> 
>>  
>> 
>> https://danluu.com/simple-architectures/
>> 
>>  
>> 
>>   > "Some areas where we’re happy with our choices even though they may not
>>   > sound like the simplest feasible solution are with our API, where we use
>>   > GraphQL, with our transport protocols, where we had a custom protocol for a
>>   > while, and our host management, where we use Kubernetes. For our transport
>>   > protocols, we used to use a custom protocol that runs on top of UDP, with an
>>   > SMS and USSD fallback, for the performance reasons described in this talk.
>>   > With the rollout of HTTP/3, we’ve been able to replace our custom protocol
>>   > with HTTP/3 and we generally only need USSD for events like the recent
>>   > internet shutdowns in Mali)."
>> 
>>  
>> 
>> I've also seen GraphQL done over Protobuf/gRPC, TCP/MsgPack, and a custom 
>> binary format:
>> 
>>  
>> 
>> - https://github.com/google/rejoiner/blob/b1cb09e9bbf7ac68bfd9c93f23a73b691e6ead72/examples-gradle/src/main/java/com/google/api/graphql/examples/streaming/graphqlserver/GraphQlGrpcServer.java#L44
>> - https://github.com/OlegIlyenko/sangria-tcp-msgpack-example
>> - https://github.com/esseswann/graphql-binary
>> 
>>  
>> 
>> If someone were interested in using Arrow as the encoding layer, how would 
>> this work in practice?
>> 
>> Arrow messages need to have a well-defined schema, and GraphQL queries return 
>> dynamic, nested data, so I'm having a hard time understanding how you'd go 
>> about representing/encoding that in an Arrow message.
>> 
>>  
>> 
>> Thank you =)
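
For reference, the request ticket described in Matt's message (a Protobuf message with two strings, marshalled to bytes) could look roughly like the sketch below. It hand-encodes the Protobuf wire format with the standard library only; a real service would presumably use generated Protobuf code instead. The field numbers (1 for the query, 2 for the variables) and the name `make_ticket` are assumptions, not taken from the thread.

```python
def _varint(n: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)
            return bytes(out)


def _string_field(field_number: int, text: str) -> bytes:
    """Encode one length-delimited (wire type 2) string field."""
    data = text.encode("utf-8")
    tag = (field_number << 3) | 2
    return _varint(tag) + _varint(len(data)) + data


def make_ticket(query: str, variables_json: str = "") -> bytes:
    """Build ticket bytes for a hypothetical message:
         string query = 1;      // assumed field number
         string variables = 2;  // assumed field number, optional JSON
    """
    ticket = _string_field(1, query)
    if variables_json:  # an empty optional field is simply left out
        ticket += _string_field(2, variables_json)
    return ticket


ticket = make_ticket("query($id: ID!) { trade(id: $id) { size } }", '{"id": "42"}')
```

The resulting bytes would then be passed as the ticket of the Flight `DoGet` call, and the server would unmarshal them back into the query and variables.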
