Re: Enhanced Console Sink for Structured Streaming

2024-03-12 Thread Neil Ramaswamy
For advanced users, it's certainly an option to look at the streaming query progress and use the state store reader to look at your state. However, the goal of this Enhanced Console Sink is to improve the experience for *new* users, i.e. it should work mostly out of the box. Let's move discussion

Re: Enhanced Console Sink for Structured Streaming

2024-03-12 Thread Mich Talebzadeh
OK, I have just been working on a Databricks engineering question raised by a user: "Monitoring structure streaming in external sink". In practice there is an option to use

Re: Enhanced Console Sink for Structured Streaming

2024-02-09 Thread Neil Ramaswamy
Thanks for the comments, Anish and Jerry. To summarize so far, we are in agreement that:
1. Enhanced console sink is a good tool for new users to understand Structured Streaming semantics
2. It should be opt-in via an option (unlike my original proposal)
3. Out of the 2 modes of verbosity I

Re: Enhanced Console Sink for Structured Streaming

2024-02-08 Thread Jerry Peng
I am generally a +1 on this as we can use this information in our docs to demonstrate certain concepts to potential users. I am in agreement with other reviewers that we should keep the existing default behavior of the console sink. This new style of output should be enabled behind a flag. As

Re: Enhanced Console Sink for Structured Streaming

2024-02-08 Thread Anish Shrigondekar
Hi Neil, Thanks for putting this together. +1 to the proposal of enhancing the console sink further. I think it will help new users understand some of the streaming/micro-batch semantics a bit better in Spark. Agree with not having verbose mode enabled by default. I think step 1 described above

Re: Enhanced Console Sink for Structured Streaming

2024-02-06 Thread Neil Ramaswamy
Jungtaek and Raghu, thanks for the input. I'm happy with the verbose mode being off by default. I think it's reasonable to have 1 or 2 levels of verbosity: 1. The first verbose mode could target new users, and take a highly opinionated view on what's important to understand streaming

Re: Enhanced Console Sink for Structured Streaming

2024-02-05 Thread Mich Talebzadeh
I don't think adding this to the streaming flow (at micro level) will be that useful. However, this can be added to the Spark UI as an enhancement to the Streaming Query Statistics page. HTH

Re: Enhanced Console Sink for Structured Streaming

2024-02-05 Thread Raghu Angadi
Agree, the default behavior does not need to change. Neil, how about separating it into two sections: - Actual rows in the sink (same as current output) - Followed by metadata data
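
The two-section layout suggested here, combined with the opt-in behavior discussed in the thread, could look roughly like the sketch below. This is plain Python, not Spark code: `render_batch`, its `verbose` parameter, and the metadata keys are hypothetical names chosen for illustration, and the row formatting is simplified rather than the console sink's actual table output.

```python
def render_batch(batch_id, rows, metadata=None, verbose=False):
    """Return console text for one micro-batch (illustrative only)."""
    # Batch header, loosely modeled on a console-style banner.
    lines = ["-" * 40, f"Batch: {batch_id}", "-" * 40]

    # Section 1: the rows written to the sink. This part is always printed,
    # so the default output is unchanged when verbose is off.
    lines.extend(str(row) for row in rows)

    # Section 2: auxiliary metadata, appended only when explicitly requested.
    if verbose and metadata:
        lines.append("")
        lines.append("Metadata:")
        for key in sorted(metadata):
            lines.append(f"  {key}: {metadata[key]}")

    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical metadata keys; the thread only says "event-time and
    # state store metadata" without fixing names.
    meta = {"watermark": "2024-02-05T00:00:00Z", "numStateRows": 3}
    print(render_batch(0, [("a", 1), ("b", 2)], meta, verbose=True))
```

Because the metadata section is strictly appended after the rows, the verbose output begins with exactly the text the default mode produces, which keeps the default behavior intact.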

Re: Enhanced Console Sink for Structured Streaming

2024-02-05 Thread Jungtaek Lim
Maybe we could keep the default as it is, and explicitly turn on verboseMode to enable auxiliary information. I'm not a believer that anyone will parse the output of console sink (which means this could be a breaking change), but changing the default behavior should be taken conservatively. We can

Re: Enhanced Console Sink for Structured Streaming

2024-02-03 Thread Neil Ramaswamy
Re: verbosity: yes, it will be more verbose. A config I was planning to implement was a default-on console sink option, verboseMode, that you can set to be off if you just want sink data. I don't think that introduces additional complexity, as the last point suggests. (And also, nobody should be

Re: Enhanced Console Sink for Structured Streaming

2024-02-03 Thread Mich Talebzadeh
Hi, As I understood, the proposal you mentioned suggests adding event-time and state store metadata to the console sink to better highlight the semantics of the Structured Streaming engine. While I agree this enhancement can provide valuable insights into the engine's behavior especially for