<h3><u>#general</u></h3><br><strong>@sjeetsingh2801: </strong>Hi
guys!<br><strong>@adrian.f.cole: </strong>Hi, a question about ServiceManager:
is it a feature or a bug to have multiple instances of the same role in
bootstrap services, e.g. 2 minions?<br><strong>@samarth: </strong>Is there a
tool or utility to read segment data, modify it, and upload it? I was looking
to generate bulk data for query performance testing in Pinot.
Thanks!<br><strong>@daniel.kocot:
</strong>@daniel.kocot has joined the channel<br><strong>@wu.yj0616:
</strong>@wu.yj0616 has joined the channel<br><strong>@vanshchaudhary:
</strong>@vanshchaudhary has joined the channel<br><strong>@bmrja:
</strong>@bmrja has joined the channel<br><strong>@daniel.kocot: </strong>Hi
there :wave:<br><strong>@syed.zeeshan.ahmed: </strong>@syed.zeeshan.ahmed has
joined the channel<br><h3><u>#random</u></h3><br><strong>@daniel.kocot:
</strong>@daniel.kocot has joined the channel<br><strong>@wu.yj0616:
</strong>@wu.yj0616 has joined the channel<br><strong>@vanshchaudhary:
</strong>@vanshchaudhary has joined the channel<br><strong>@bmrja:
</strong>@bmrja has joined the channel<br><strong>@syed.zeeshan.ahmed:
</strong>@syed.zeeshan.ahmed has joined the
channel<br><h3><u>#feat-presto-connector</u></h3><br><strong>@vkryuchkov:
</strong>@vkryuchkov has joined the
channel<br><h3><u>#troubleshooting</u></h3><br><strong>@npawar: </strong>it
works for both<br><strong>@npawar: </strong>have you put epochMinutes in your
schema?<br><strong>@pradeepgv42: </strong>yeah it’s part of the
schema<br><strong>@pradeepgv42: </strong>let me try with the latest
code<br><strong>@pradeepgv42: </strong>Actually with the latest code it worked,
I was using a slightly older version from
master<br><strong>@arici: </strong>When I use the Hour() function like below:
```SELECT "timestamp",variant_id,sum(amount) FROM Sales WHERE operator_id = 1
AND campaign_id = 1 GROUP BY Hour("timestamp"), variant_id```
I get the following error. It says 'timestamp' should appear in GROUP BY
clause. But it already does.
```ProcessingException(errorCode:150, message:PQLParsingError:
org.apache.pinot.sql.parsers.SqlCompilationException: 'timestamp' should appear in GROUP BY clause.
    at org.apache.pinot.sql.parsers.CalciteSqlParser.validateGroupByClause(CalciteSqlParser.java:177)
    at org.apache.pinot.sql.parsers.CalciteSqlParser.validate(CalciteSqlParser.java:114)
    at org.apache.pinot.sql.parsers.CalciteSqlParser.queryRewrite(CalciteSqlParser.java:364)
    at org.apache.pinot.sql.parsers.CalciteSqlParser.compileCalciteSqlToPinotQuery(CalciteSqlParser.java:338)
    at org.apache.pinot.sql.parsers.CalciteSqlParser.compileToPinotQuery(CalciteSqlParser.java:104)
    at org.apache.pinot.sql.parsers.CalciteSqlCompiler.compileToBrokerRequest(CalciteSqlCompiler.java:33)
    at org.apache.pinot.controller.api.resources.PinotQueryResource.getQueryResponse(PinotQueryResource.java:158)
    at org.apache.pinot.controller.api.resources.PinotQueryResource.handlePostSql(PinotQueryResource.java:131)
    at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167))```<br><strong>@g.kishore:
</strong>It’s not valid SQL. You are selecting a column but grouping by a
function on that column<br><strong>@arici: </strong>Also, when I get rid of the
sum aggregation it works perfectly. (Is this valid SQL?)
```SELECT "timestamp",variant_id FROM Sales WHERE operator_id = 1 AND
campaign_id = 1 GROUP BY Hour("timestamp"), variant_id```<br><strong>@arici:
</strong>When I bring in the sum aggregation, I get the error
above.
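The fix kishore is pointing at: every non-aggregate expression in the SELECT
list must itself appear in the GROUP BY clause. A minimal sketch of the
corrected query (keeping the same `Hour()` transform; the `hour_ts` alias is
illustrative, not from the thread):
```SELECT Hour("timestamp") AS hour_ts, variant_id, sum(amount) FROM Sales
WHERE operator_id = 1 AND campaign_id = 1
GROUP BY Hour("timestamp"), variant_id```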
<br><strong>@yash.agarwal: </strong>How can I achieve something similar to
`local.directory.sequence.id=true` in
`SparkSegmentGenerationJobRunner`?<br><strong>@elon.azoulay: </strong>Hi, great
talk yesterday! I have a question
about the new date time field spec. I created a simple test table with the
schema below but it does not seem to be bucketing into seconds (i.e.
granularity). I inserted a time value every 500 ms to verify and did not see
that the `ts` column is bucketed - is there another name for the "bucketed"
column?
Here is the schema:
```{
  "schemaName" : "myTable2",
  "dimensionFieldSpecs" : [ {
    "name" : "col1",
    "dataType" : "LONG",
    "defaultNullValue" : 0
  } ],
  "metricFieldSpecs" : [ {
    "name" : "m1",
    "dataType" : "LONG"
  } ],
  "dateTimeFieldSpecs" : [ {
    "name" : "ts",
    "dataType" : "LONG",
    "format" : "1:MILLISECONDS:EPOCH",
    "granularity" : "1:SECONDS"
  } ]
}```
And here is the data:
```"resultTable": {
"dataSchema": {
"columnDataTypes": ["LONG", "LONG", "LONG"],
"columnNames": ["col1", "m1", "ts"]
},
"rows": [
[0, 0, 1598431406131],
[1, 1000, 1598431406634],
[2, 2000, 1598431407134],
[3, 3000, 1598431407634],
[4, 4000, 1598431408134],
[5, 5000, 1598431408634],
[6, 6000, 1598431409134],
[7, 7000, 1598431409634],
[8, 8000, 1598431410134],
[9, 9000, 1598431410634]
]
},```<br><strong>@g.kishore: </strong>you need to use
transformFunction<br><strong>@elon.azoulay: </strong>Thanks! I saw examples in
the unit tests; would I be using one of the functions with the `bucketed`
suffix, or just the regular `toEpochSeconds` transform?<br><strong>@g.kishore:
</strong>@npawar ^^<br><strong>@npawar: </strong>if you want to round to the
nearest second but still have it in millis, then use `"tsSeconds" :
"round(ts, 1000)"` in your transform function<br><strong>@npawar: </strong>if
you want to convert it to secondsSinceEpoch, then you can use
toEpochSeconds<br><strong>@npawar: </strong>and set that in your table config
like this:
<https://u17000708.ct.sendgrid.net/ls/click?upn=1BiFF0-2FtVRazUn1cLzaiMdTeAXadp8BL3QinSdRtJdpjVdLs6pcP-2BmJYu0RGOJgjW4yUWessZqxMl9cgpZEQvZ61b7e-2FULIdDgh3zQyBB6rFDS3rtq9FjWH9nlZ0hZ99VBNms8zq1x7TdCDGsaZg5i-2F97vZlpfEl6fip6rjl-2BUw-3DKTW8_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTx1Zqjzg7h1GWrR9dxdr41nIlX5XgY4a7lmIKAYAmSJ8jxcdrosniuMabULDHSRR53N25lWt5Z31h4KoVp0i4rJrVHsZu6FU6E8TNzm0D2BD8msKdvc7zf0k0N5Bzp8r0ofAJyGHGjnAYV41H-2Fxd6BJablox7APRcEugKYHEM7-2Fvzn1ns-2FOgmeRyfH6r36NMAQ-3D>
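A minimal sketch of what that table-config snippet might look like, following
the ingestion transform docs linked above (the `ingestionConfig` /
`transformConfigs` field names are taken from the Pinot docs and may differ
across versions; the `tsSeconds` destination column would also need to be
declared in the schema):
```"ingestionConfig": {
  "transformConfigs": [
    {
      "columnName": "tsSeconds",
      "transformFunction": "round(ts, 1000)"
    }
  ]
}```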
<br><strong>@elon.azoulay: </strong>Thanks!<br><strong>@elon.azoulay:
</strong>Did `streamConfigs` change
to `ingestionConfig` in 0.5.0?<br><strong>@elon.azoulay: </strong>Or is it an
additional config?<br><strong>@elon.azoulay: </strong>So the granularity
doesn't do anything by itself; a transform function is also needed,
right?<br><strong>@npawar: </strong>it is an additional
config<br><strong>@npawar:
</strong>correct about granularity<br><strong>@elon.azoulay: </strong>That
really helps, thanks:) Also, great talk yesterday!<br><strong>@npawar:
</strong>thank you! :slightly_smiling_face:<br><strong>@elon.azoulay:
</strong>Sorry, another date time field spec question: can you have multiple
date time columns bucketed on different granularities based on one source
column?
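The thread ends there, but the pieces discussed above suggest one plausible
shape: declare multiple `dateTimeFieldSpecs` and derive each from the one
source column with its own transform. A sketch only, not confirmed in the
thread (`toEpochSeconds` appears above; `toEpochHours` and the column names
are assumptions):
```"dateTimeFieldSpecs": [
  { "name": "tsSeconds", "dataType": "LONG", "format": "1:SECONDS:EPOCH", "granularity": "1:SECONDS" },
  { "name": "tsHours", "dataType": "LONG", "format": "1:HOURS:EPOCH", "granularity": "1:HOURS" }
]```
with matching entries in the table config:
```"transformConfigs": [
  { "columnName": "tsSeconds", "transformFunction": "toEpochSeconds(ts)" },
  { "columnName": "tsHours", "transformFunction": "toEpochHours(ts)" }
]```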
<br><h3><u>#lp-pinot-poc</u></h3><br><strong>@fx19880617: </strong>I got
the docker build working; there is one missing thing about the istio
context<br><strong>@fx19880617:
</strong><https://u17000708.ct.sendgrid.net/ls/click?upn=1BiFF0-2FtVRazUn1cLzaiMSpQpf1vBNN7VUyMr5oI-2BtCAcH-2BA3iHH8tMnNcU0bwQJ7jx5-2B09UtwlYTMtvS0x-2BsWBHHckiDZf6ZXToOxKfdc-2Bq3yORTyU-2FdOxOeg-2FLAvytiO4o_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTx1Zqjzg7h1GWrR9dxdr41nJBzgxN1syEs-2FjZcdBcX2k2p-2BoZjD3xoguKHwd-2FDD5tTFhIF7T50C7VXevEzr-2FjLkEVXjOnyxkQl8MNozVDRkHBubRLzMEAxJK-2FsaxliNIGLMiFIkEH6mo2UBvZCuAoFqdq7Zzz-2BpP-2Ff7l-2FOrxpCXDgU7a87kfNRRfwXoLetdPDE-3D><br>