[
https://issues.apache.org/jira/browse/BEAM-12169?focusedWorklogId=724170&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-724170
]
ASF GitHub Bot logged work on BEAM-12169:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 09/Feb/22 23:48
Start Date: 09/Feb/22 23:48
Worklog Time Spent: 10m
Work Description: codecov[bot] edited a comment on pull request #16615:
URL: https://github.com/apache/beam/pull/16615#issuecomment-1026404532
# [Codecov](https://codecov.io/gh/apache/beam/pull/16615?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging
[#16615](https://codecov.io/gh/apache/beam/pull/16615?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
(38aadab) into
[master](https://codecov.io/gh/apache/beam/commit/640e6c245a89581578da68b26d6e27b55deb6a80?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
(640e6c2) will **increase** coverage by `8.98%`.
> The diff coverage is `100.00%`.
> :exclamation: Current head 38aadab differs from pull request most recent
head 53cfde6. Consider uploading reports for the commit 53cfde6 to get more
accurate results
```diff
@@            Coverage Diff             @@
##           master   #16615      +/-   ##
==========================================
+ Coverage   74.63%   83.61%   +8.98%
==========================================
  Files         653      452     -201
  Lines       81876    62105   -19771
==========================================
- Hits        61105    51927    -9178
+ Misses      19785    10178    -9607
+ Partials      986        0     -986
```
| [Impacted Files](https://codecov.io/gh/apache/beam/pull/16615?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [sdks/python/apache\_beam/dataframe/frames.py](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL2ZyYW1lcy5weQ==) | `94.96% <100.00%> (+0.02%)` | :arrow_up: |
| [...pache\_beam/runners/interactive/interactive\_beam.py](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9iZWFtLnB5) | `76.41% <0.00%> (-0.95%)` | :arrow_down: |
| [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.85% <0.00%> (-0.51%)` | :arrow_down: |
| [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `90.50% <0.00%> (-0.34%)` | :arrow_down: |
| [...dks/python/apache\_beam/options/pipeline\_options.py](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vb3B0aW9ucy9waXBlbGluZV9vcHRpb25zLnB5) | `95.27% <0.00%> (ø)` | |
| [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.51% <0.00%> (ø)` | |
| [sdks/go/pkg/beam/core/graph/coder/bool.go](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL2Jvb2wuZ28=) | | |
| [sdks/go/pkg/beam/core/util/reflectx/call.go](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3V0aWwvcmVmbGVjdHgvY2FsbC5nbw==) | | |
| [sdks/go/pkg/beam/core/graph/coder/double.go](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL2RvdWJsZS5nbw==) | | |
| [sdks/go/pkg/beam/flatten.go](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9mbGF0dGVuLmdv) | | |
| ... and [200 more](https://codecov.io/gh/apache/beam/pull/16615/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | | |
------
[Continue to review full report at
Codecov](https://codecov.io/gh/apache/beam/pull/16615?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn
more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by
[Codecov](https://codecov.io/gh/apache/beam/pull/16615?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
Last update
[640e6c2...53cfde6](https://codecov.io/gh/apache/beam/pull/16615?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
Read the [comment
docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
Issue Time Tracking
-------------------
Worklog Id: (was: 724170)
Time Spent: 2h 50m (was: 2h 40m)
> Allow non-deferred column operations on categorical columns
> -----------------------------------------------------------
>
> Key: BEAM-12169
> URL: https://issues.apache.org/jira/browse/BEAM-12169
> Project: Beam
> Issue Type: Improvement
> Components: dsl-dataframe, sdk-py-core
> Reporter: Brian Hulette
> Assignee: Andy Ye
> Priority: P3
> Labels: dataframe-api
> Time Spent: 2h 50m
> Remaining Estimate: 0h
>
> There are several operations that we currently disallow because they produce
> a variable set of columns in the output based on the data
> (non-deferred-columns). However, for some dtypes (categorical, boolean) we
> can easily enumerate all the possible values that will be seen at execution
> time, so we can predict the columns that will be seen.
> Note that we still can't match pandas' behavior exactly for these operations:
> pandas typically creates columns only for the values that are _observed_,
> while we would have to create a column for every possible value.
> We should allow these operations in these special cases.
> Operations in this category:
> - DataFrame.unstack (can work if unstacked level is a categorical or boolean
> column)
> - Series.str.get_dummies
> - Series.str.split
> - Series.str.rsplit
> - DataFrame.pivot
> - DataFrame.pivot_table
> - len(GroupBy) and ngroups
>   - if the groupers are all categorical _and_ observed=False, or all boolean
>   - Note that these two may not actually be equivalent in all cases:
>     https://github.com/pandas-dev/pandas/issues/26326
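
To make the core idea in the description concrete, here is a minimal sketch in plain pandas (not Beam code) of how the possible output columns could be enumerated from a dtype alone. The helper `possible_values` is hypothetical and only for illustration; it is not part of pandas or of the Beam DataFrame API.

```python
import pandas as pd


def possible_values(dtype):
    """Hypothetical helper: list every value a column of this dtype can hold.

    Returns None when the set of values cannot be known without the data.
    """
    if isinstance(dtype, pd.CategoricalDtype):
        # The categories are part of the dtype, so they are known up front.
        return list(dtype.categories)
    if pd.api.types.is_bool_dtype(dtype):
        return [False, True]
    return None


# "z" is a declared category that never occurs in the data.
s = pd.Series(["x", "y", "x"], dtype=pd.CategoricalDtype(["x", "y", "z"]))

possible = possible_values(s.dtype)       # known from the dtype alone
observed = sorted(set(s.astype(object)))  # known only after reading the data
```

Here `possible` contains `z` even though it never appears in the data, which is the mismatch with pandas' observed-only columns that the description points out.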
--
This message was sent by Atlassian Jira
(v8.20.1#820001)