Re: Joining Avro records

2015-04-09 Thread Roger Hoover
Thanks, Julian. Good point about needing aliasing for unique names in SQL. I didn't know about array_agg...nice.

Re: Joining Avro records

2015-04-09 Thread Roger Hoover
Yi Pan, Thanks for your response. I'm thinking that I'll iterate over the fields of the input schemas (similar to this https://github.com/apache/samza/blob/samza-sql/samza-sql/src/main/java/org/apache/samza/sql/metadata/AvroSchemaConverter.java#L58-L62), match them up with the output schema and t
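
The field-by-field merge described in this thread can be sketched roughly as follows. This is only an illustration, not code from the samza-sql branch: it uses plain Python dicts in place of Avro records, and the function name `join_records`, the alias-prefix naming scheme, and the sample records are all hypothetical. Colliding field names are prefixed with an alias, since SQL does not allow two fields of the same name in one record.

```python
def join_records(left, right, left_alias, right_alias):
    """Merge two dict records into one, similar in spirit to SELECT *.

    Hypothetical helper: field names present in both inputs are
    prefixed with the given alias so the output names stay unique.
    """
    collisions = left.keys() & right.keys()
    joined = {}
    for alias, record in ((left_alias, left), (right_alias, right)):
        for name, value in record.items():
            # Only rename fields that would otherwise clash.
            key = f"{alias}_{name}" if name in collisions else name
            if key in joined:
                # A prefixed name can still clash with an existing field.
                raise ValueError(f"duplicate output field: {key}")
            joined[key] = value
    return joined


# Sample records standing in for types A and B (made-up data).
a = {"id": 1, "name": "alpha"}
b = {"id": 1, "score": 0.5}
ab = join_records(a, b, "a", "b")
```

With real Avro records the same loop would walk the fields of each input schema rather than dict keys, and the output schema would need to be built or supplied separately.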

Re: Joining Avro records

2015-04-09 Thread Julian Hyde
Much of this is about mapping from logical fields (i.e. the fields you can reference in SQL) down to the Avro representation; I’m no expert on that mapping, so I’ll focus on the SQL stuff. First, SQL doesn’t allow a record to have two fields of the same name, so you wouldn’t be allowed to have

Re: Joining Avro records

2015-04-09 Thread Yi Pan
Hi, Roger, Good question on that. I am actually not aware of any "automatic" way of doing this in Avro. I have tried to add generic Schema and Data interfaces in the samza-sql branch to address the morphing of the schemas from input streams to the output streams. The basic idea is to have wrapper Schem

Joining Avro records

2015-04-09 Thread Roger Hoover
Hi Milinda and others, This is an Avro question but since you guys are working on Avro support for stream SQL, I thought I'd ask you for help. If I have two records of type A and B as below and want to join them similar to "SELECT *" in SQL to produce a record of type AB, is there a simple way

Re: Newbie questions after completing "Hello Samza" about performance and project setup

2015-04-09 Thread Roger Hoover
Hi Warren, Yes, I think Hello Samza is the template project to work from. I believe that the slow message rate that you are seeing is because it's subscribed to the Wikipedia IRC stream, which may only generate a few events per second. That said, some of the example configuration for the hell

Newbie questions after completing "Hello Samza" about performance and project setup

2015-04-09 Thread Warren Henning
Hi, I ran the commands in http://samza.apache.org/startup/hello-samza/0.9/ successfully. Fascinating stuff! I was running all the processes on my (fairly recent model) MacBook Pro. One thing I've heard about Kafka and Samza is their performance -- handling thousands of messages a second. E.g., http://