[ 
https://issues.apache.org/jira/browse/ARROW-16316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zsolt Kegyes-Brassai updated ARROW-16316:
-----------------------------------------
    Description: 
I was trying to aggregate over time using different granularities. Usually I 
would use {{lubridate::floor_date()}}, which is currently not supported 
for parquet datasets.

Is there a comprehensive list of the currently supported 
{{lubridate}} (or {{dplyr}}) verbs? Maybe it’s my fault, but apart from 
the changelog I haven’t found any relevant information.

 

Later I found that the {{round_temporal()}} function is exposed to {{R}}. 
But I am struggling to find the right syntax inside a mutate statement to apply 
it to a {{timestamp[us, tz=UTC]}} type column.
{code:java}
new_dataset |>
  mutate(time = arrow_round_temporal(time))
#> Error: Invalid: Attempted to initialize KernelState from null FunctionOptions
{code}
 

Here are some other attempts:
{code:java}
library(arrow)

arrow_now <- Scalar$create(lubridate::now())
(arrow_now)
#> Scalar
#> 2022-04-25 11:44:33.805609
call_function("round_temporal", arrow_now)
#> Scalar
#> 2022-04-25 00:00:00.000000
call_function("round_temporal", arrow_now, unit = "day")
#> Error: Argument 2 is of class character but it must be one of "Array", "ChunkedArray", "RecordBatch", "Table", or "Scalar"
arrow_unit <- Scalar$create("day")
(arrow_unit)
#> Scalar
#> day
call_function("round_temporal", arrow_now, unit = arrow_unit)
#> Error: Invalid: Function 'round_temporal' accepts 1 arguments but attempted to look up kernel(s) with 2
{code}
 


> How to round the timestamps in a mutate statement?
> --------------------------------------------------
>
>                 Key: ARROW-16316
>                 URL: https://issues.apache.org/jira/browse/ARROW-16316
>             Project: Apache Arrow
>          Issue Type: Wish
>    Affects Versions: 7.0.0
>            Reporter: Zsolt Kegyes-Brassai
>            Priority: Minor
>
> I was trying to aggregate over time using different granularities. Usually I 
> would use {{lubridate::floor_date()}}, which is currently not supported 
> for parquet datasets.
> Is there a comprehensive list of the currently supported 
> {{lubridate}} (or {{dplyr}}) verbs? Maybe it’s my fault, but 
> apart from the changelog I haven’t found any relevant information.
>  
> Later I found that the {{round_temporal()}} function is exposed to {{R}}. 
> But I am struggling to find the right syntax inside a mutate statement to 
> apply it to a {{timestamp[us, tz=UTC]}} type column.
> {code:java}
> new_dataset |>
>   mutate(time = arrow_round_temporal(time))
> #> Error: Invalid: Attempted to initialize KernelState from null FunctionOptions
> {code}
>  
> Here are some other attempts:
> {code:java}
> library(arrow)
> arrow_now <- Scalar$create(lubridate::now())
> (arrow_now)
> #> Scalar
> #> 2022-04-25 11:44:33.805609
> call_function("round_temporal", arrow_now)
> #> Scalar
> #> 2022-04-25 00:00:00.000000
> call_function("round_temporal", arrow_now, unit = "day")
> #> Error: Argument 2 is of class character but it must be one of "Array", "ChunkedArray", "RecordBatch", "Table", or "Scalar"
> arrow_unit <- Scalar$create("day")
> (arrow_unit)
> #> Scalar
> #> day
> call_function("round_temporal", arrow_now, unit = arrow_unit)
> #> Error: Invalid: Function 'round_temporal' accepts 1 arguments but attempted to look up kernel(s) with 2
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
