Polber commented on issue #30498:
URL: https://github.com/apache/beam/issues/30498#issuecomment-2015692629
I am unable to reproduce the error.
I used the following pipeline:
```
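import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.sql.SqlTransform;
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.Row;

// Single-field input schema and a one-row input collection.
Schema SCHEMA = Schema.of(Schema.Field.of("event", Schema.FieldType.STRING));
Pipeline pipeline = Pipeline.create();
PCollection<Row> input = pipeline.apply(
    Create.of(
        Row.withSchema(SCHEMA).withFieldValue("event", "abc").build()
    )
).setRowSchema(SCHEMA);

// Aggregate with Beam SQL; the query string must stay on one line.
PCollection<Row> transformed = input.apply(
    SqlTransform.query(
        "select event as event_name, count(*) as c from PCOLLECTION group by event"));

// Print each output row and pass it through unchanged.
transformed.apply(ParDo.of(new DoFn<Row, Object>() {
  @ProcessElement
  public void processElement(ProcessContext c) throws Exception {
    System.out.println(c.element());
    c.output(c.element());
  }
})).setRowSchema(
    Schema.of(
        Schema.Field.of("event", Schema.FieldType.STRING),
        Schema.Field.of("c", Schema.FieldType.INT64)
    )
);
pipeline.run();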
```
Logs:
```
Mar 22, 2024 2:30:06 PM org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `PCOLLECTION`.`event` AS `event_name`, COUNT(*) AS `c`
FROM `beam`.`PCOLLECTION` AS `PCOLLECTION`
GROUP BY `PCOLLECTION`.`event`
Mar 22, 2024 2:30:07 PM org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(event_name=[$0], c=[$1])
  LogicalAggregate(group=[{0}], c=[COUNT()])
    BeamIOSourceRel(table=[[beam, PCOLLECTION]])
Mar 22, 2024 2:30:07 PM org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..1=[{inputs}], proj#0..1=[{exprs}])
  BeamAggregationRel(group=[{0}], c=[COUNT()])
    BeamIOSourceRel(table=[[beam, PCOLLECTION]])
Mar 22, 2024 2:30:08 PM org.apache.beam.sdk.util.construction.Environments$JavaVersion forSpecification
WARNING: Unsupported Java version: 18, falling back to: 17
```
Output:
```
Row:
event_name:abc
c:1
```
If you share more details about your pipeline, I can try to reproduce the
error. As a first step, try modeling your pipeline on the one I posted above
and see whether that resolves the issue.
---
As far as...
> Event_name column is correct, but why C0? and not c?
This is to avoid a name collision. In your query, you select a column `c` and
also create a column `c` by aggregating `count(*)`, so the second one is
renamed. This is working as expected.
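
To illustrate the renaming described above, here is a minimal plain-Java sketch of name de-duplication. It is not the Beam or Calcite implementation: the class `ColumnNames`, the method `uniquify`, and the rule "append the first free numeric suffix, starting at 0" are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Beam or Calcite.
class ColumnNames {
    // Returns the names in order, renaming any repeated name by appending
    // the first numeric suffix (0, 1, 2, ...) that is not already taken.
    static List<String> uniquify(List<String> names) {
        Set<String> used = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String name : names) {
            String candidate = name;
            int suffix = 0;
            // Set.add returns false when the candidate is already in use.
            while (!used.add(candidate)) {
                candidate = name + suffix++;
            }
            result.add(candidate);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two output columns both ask for the name "c".
        System.out.println(uniquify(List.of("event_name", "c", "c")));
        // prints [event_name, c, c0]
    }
}
```

Under these assumptions, giving the aggregate an alias that no selected column already uses (for example `cnt`) would leave every name unchanged.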