Re: [Question] Invitation to Join Beam Slack Channel

2023-02-13 Thread Joe Crowley
Hi,
For anyone reading this with the ability, i too would like to be invited to
the slack. Thanks also!

On Tue, 14 Feb 2023 at 05:37, Leonardo Reis  wrote:

> Hi, I would like to learn more by joining the Beam Slack Channel. But it
> seems I can't join with my personal email and I don't have the @apache.org
> email either. I would like to ask if I can get invited to the Slack
> channel, or maybe there is another way to join?
>
> Thanks! Regards
>


[Question] Invitation to Join Beam Slack Channel

2023-02-13 Thread Leonardo Reis
Hi, I would like to learn more by joining the Beam Slack Channel. But it
seems I can't join with my personal email and I don't have the @apache.org
email either. I would like to ask if I can get invited to the Slack
channel, or maybe there is another way to join?

Thanks! Regards


Fwd: Beam SQL Extensions Windowing question/problem

2023-02-13 Thread Wiśniowski Piotr

Hi,

I am quite new to Calcite or ZetaSQL SQL dialects, but I did read all 
the doc I could find and could not make below code work. Any ideas what 
I could be doing wrongly? (Beam 2.44 or latest main builds behave the 
same, DirectRunner)


```

CREATE EXTERNAL TABLE pub_sub_example(
    event_timestamp TIMESTAMP,
    `type` VARCHAR,
    `value` INTEGER
)
TYPE pubsub
LOCATION 'projects/some-project/topics/etl-raw-user-behaviour'
TBLPROPERTIES '{"format":"json"}';

```

Below properly returns results:

```SELECT * FROM pub_sub_example LIMIT 2;```

But below ones works indefinitely

```

SELECT
    `type` AS type_kind,
    TUMBLE_START(event_timestamp, INTERVAL '1' SECOND) AS win,
    COUNT(*) AS count_of_messages
FROM pub_sub_example
GROUP BY TUMBLE(event_timestamp, INTERVAL '1' SECOND), `type`
LIMIT 2;

```

Moreover when I do:

```

CREATE EXTERNAL TABLE pub_sub_hist (
    type_kind VARCHAR,
    win TIMESTAMP,
    count_of_messages INTEGER
)
TYPE parquet
LOCATION 'gs://etl-user-behaviour/output'
TBLPROPERTIES '{"file_name_suffix":".parquet", 
"shard_name_template":"/S_of_N", 
"mime_type":"application/vnd.apache.parquet"}'

;

INSERT INTO
SELECT
    `type` AS type_kind,
    TUMBLE_START(event_timestamp, INTERVAL '1' SECOND) AS win,
    COUNT(*) AS count_of_messages
FROM pub_sub_example
GROUP BY TUMBLE(event_timestamp, INTERVAL '1' SECOND), `type`;

```

I do get error that I cannot understand:

groupByKey cannot be applied to non-bounded PCollection in the 
GlobalWindow without a trigger. Use a Window.into or Window.triggering 
transform prior to GroupByKey. (state=,code=0)


I think that I am providing a window as `TUMBLE(event_timestamp, 
INTERVAL '1' SECOND)` in the exact `GROUP BY` step but it still threats 
the table as in `GlobalWindow`?


Moreover I did find in Calcite SQL docs that this kind of windowing is 
depreciated: 
https://calcite.apache.org/docs/reference.html#grouped-window-functions 
and even DataFlow SQL examples do use new syntax 
(https://cloud.google.com/dataflow/docs/reference/sql/streaming-extensions) 
that is present also in Calcite 
(https://calcite.apache.org/docs/reference.html#tumble) but Beam SQL 
does not parse it. Any idea what is the roadmap for this functionalities?


Really appreciate quick reply.

Best regards

Wiśniowski Piotr