[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2023-03-05 Thread via GitHub
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1455360592 @hvanhovell @grundprinzip @HyukjinKwon @zhengruifeng @amaliujia Thank you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2023-03-05 Thread via GitHub
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1455279364 > @beliefer can you please remove the `is_observation` code path? And take another look at the protocol. Otherwise I think it looks good. The `is_observation` code path has been removed.

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2023-02-13 Thread via GitHub
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1428966021 > @beliefer will take a look today. Thanks for your hard work and patience! Thank you.

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2023-02-12 Thread via GitHub
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1427479417 ping @hvanhovell Could you review again?

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2023-01-15 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1383316479 ping @hvanhovell

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2023-01-09 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1376762870 It seems the failure is unrelated to this PR.

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2023-01-08 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1375067246 > In particular, the discussion on the `isObservation` flag in the proto message needs to be addressed to simplify. Hi, @grundprinzip . In fact, I removed the `Observation` that

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2023-01-05 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1371935451 ping @hvanhovell @HyukjinKwon

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2022-12-31 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1368190192 ping @hvanhovell @grundprinzip @zhengruifeng @HyukjinKwon

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2022-12-27 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1365847958 ping @grundprinzip

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2022-12-20 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1360966070 ping @hvanhovell @grundprinzip @zhengruifeng @amaliujia

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2022-12-20 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1359204590 > I don't have enough experience if it's worth it to do another full round trip to the server for that. Can we experiment for now in just immediately returning them? The observed

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2022-12-20 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1359204069 > @beliefer can we just send them as part of the `ExecutePlanResponse` at the end of the query? Doing another RPC seems a bit wasteful, and it means we have to track query state in the

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement `DataFrame.observe`

2022-12-19 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1357520476 > I think it would be possible to add another result batch type for observed metrics and simply pass them at the end. I have an idea: 1. cache the `Observation` at server.

[GitHub] [spark] beliefer commented on pull request #39091: [SPARK-41527][CONNECT][PYTHON] Implement DataFrame.observe

2022-12-16 Thread GitBox
beliefer commented on PR #39091: URL: https://github.com/apache/spark/pull/39091#issuecomment-1356003824 > @beliefer thanks for working on this. I have one question how are we going to get the observed metrics to the client? This seems to be missing from the implementation. One of the