Thank you all for the +1s and for reviewing the SPIP doc.

Kazu

> On Feb 1, 2023, at 1:28 AM, Dongjoon Hyun <[email protected]> wrote:
>
> +1
>
> On Wed, Feb 1, 2023 at 12:52 AM Mich Talebzadeh <[email protected]> wrote:
> +1
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss,
> damage or destruction of data or any other property which may arise from
> relying on this email's technical content is explicitly disclaimed. The
> author will in no case be liable for any monetary damages arising from such
> loss, damage or destruction.
>
> On Wed, 1 Feb 2023 at 02:23, huaxin gao <[email protected]> wrote:
> +1
>
> On Tue, Jan 31, 2023 at 6:10 PM DB Tsai <[email protected]> wrote:
> +1
>
> Sent from my iPhone
>
>> On Jan 31, 2023, at 4:16 PM, Yuming Wang <[email protected]> wrote:
>>
>> +1.
>>
>> On Wed, Feb 1, 2023 at 7:42 AM kazuyuki tanimura <[email protected]> wrote:
>> Great! Much appreciated, Mich!
>>
>> Kazu
>>
>>> On Jan 31, 2023, at 3:07 PM, Mich Talebzadeh <[email protected]> wrote:
>>>
>>> Thanks, Kazu.
>>>
>>> I followed that template link and, as you pointed out, it is indeed a
>>> common template. If it works then it is what it is.
>>>
>>> I will be going through your design proposals and hopefully we can review
>>> them.
>>>
>>> Regards,
>>>
>>> Mich
>>>
>>> On Tue, 31 Jan 2023 at 22:34, kazuyuki tanimura <[email protected]> wrote:
>>> Thank you Mich. I followed the instructions at
>>> https://spark.apache.org/improvement-proposals.html and used its template.
>>> While we are open to revising our design doc, it seems more like you are
>>> proposing that the community change the instructions themselves?
>>>
>>> Kazu
>>>
>>>> On Jan 31, 2023, at 11:24 AM, Mich Talebzadeh <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Thanks for these proposals. Good suggestions. Is this style of breaking
>>>> down your approach standard?
>>>>
>>>> My view would be that perhaps it makes more sense to follow the
>>>> industry-established approach of breaking down your technical proposal
>>>> into:
>>>>
>>>> Background
>>>> Objective
>>>> Scope
>>>> Constraints
>>>> Assumptions
>>>> Reporting
>>>> Deliverables
>>>> Timelines
>>>> Appendix
>>>>
>>>> Your current approach, using the questions below,
>>>>
>>>> Q1. What are you trying to do? Articulate your objectives using absolutely
>>>> no jargon. What are you trying to achieve?
>>>> Q2. What problem is this proposal NOT designed to solve? What issues is
>>>> the suggested proposal not going to address?
>>>> Q3. How is it done today, and what are the limits of current practice?
>>>> Q4. What is new in your approach and why do you think it will succeed?
>>>> Q5. Who cares? If you are successful, what difference will it make? If
>>>> your proposal succeeds, what tangible benefits will it add?
>>>> Q6. What are the risks?
>>>> Q7. How long will it take?
>>>> Q8. What are the midterm and final “exams” to check for success?
>>>>
>>>> may not do justice to your proposal.
>>>>
>>>> HTH
>>>>
>>>> Mich
>>>>
>>>> On Tue, 31 Jan 2023 at 17:35, kazuyuki tanimura <[email protected]> wrote:
>>>> Hi everyone,
>>>>
>>>> I would like to start a discussion on “Lazy Materialization for Parquet
>>>> Read Performance Improvement”.
>>>>
>>>> Chao and I propose a Parquet reader with lazy materialization. For
>>>> Spark SQL filter operations, evaluating the filters first and lazily
>>>> materializing only the values that are used can avoid wasted computation
>>>> and improve read performance.
>>>> The current implementation of Spark requires the read values to be
>>>> materialized (i.e. decompressed, decoded, etc.) into memory before the
>>>> filters are applied, even though the filters may eventually throw away
>>>> many values.
>>>>
>>>> Our design doc is as follows.
>>>> SPIP Jira: https://issues.apache.org/jira/browse/SPARK-42256
>>>> SPIP Doc:
>>>> https://docs.google.com/document/d/1Kr3y2fVZUbQXGH0y8AvdCAeWC49QJjpczapiaDvFzME
>>>>
>>>> Liang-Chi was kind enough to shepherd this effort.
>>>>
>>>> Thank you
>>>> Kazu
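
For anyone skimming the thread, the idea in the original proposal quoted above (evaluate the filter first, then materialize only the surviving values) can be illustrated with a toy Python sketch. This is not the SPIP's design and not Spark's Parquet reader: `EncodedColumn`, `eager_scan`, `lazy_scan`, and `make_table` are hypothetical names invented for illustration, and the "encoding" is only a stand-in for real decompression and decoding.

```python
from typing import Callable, Dict, List


class EncodedColumn:
    """Toy stand-in for an encoded column chunk (hypothetical, not a Spark class)."""

    def __init__(self, values: List[int]) -> None:
        # "Encoding" is just int -> bytes; a real reader would decompress and decode pages.
        self._encoded = [str(v).encode("ascii") for v in values]
        self.decoded_count = 0  # how many values have been materialized so far

    def __len__(self) -> int:
        return len(self._encoded)

    def decode(self, row: int) -> int:
        self.decoded_count += 1
        return int(self._encoded[row])


def eager_scan(columns: Dict[str, EncodedColumn], filter_col: str,
               predicate: Callable[[int], bool]) -> List[Dict[str, int]]:
    """Materialize every value of every column, then apply the filter."""
    n = len(columns[filter_col])
    rows = [{name: col.decode(i) for name, col in columns.items()} for i in range(n)]
    return [row for row in rows if predicate(row[filter_col])]


def lazy_scan(columns: Dict[str, EncodedColumn], filter_col: str,
              predicate: Callable[[int], bool]) -> List[Dict[str, int]]:
    """Decode only the filter column first, then materialize the other columns
    for the rows that pass the filter."""
    fcol = columns[filter_col]
    filter_values = [fcol.decode(i) for i in range(len(fcol))]
    selected = [i for i, v in enumerate(filter_values) if predicate(v)]
    result = []
    for i in selected:
        row = {filter_col: filter_values[i]}
        for name, col in columns.items():
            if name != filter_col:
                row[name] = col.decode(i)  # only surviving rows are decoded
        result.append(row)
    return result


def make_table() -> Dict[str, EncodedColumn]:
    # Two columns of ten rows each; the data is made up for the example.
    return {
        "id": EncodedColumn(list(range(10))),
        "payload": EncodedColumn([i * i for i in range(10)]),
    }


if __name__ == "__main__":
    eager_table, lazy_table = make_table(), make_table()
    keep = lambda v: v >= 8
    print(eager_scan(eager_table, "id", keep) == lazy_scan(lazy_table, "id", keep))
    print(sum(c.decoded_count for c in eager_table.values()))
    print(sum(c.decoded_count for c in lazy_table.values()))
```

In this toy setup both scans return the same rows, but the lazy scan never decodes the `payload` values of rows the filter rejects. How much a real reader would save depends on filter selectivity and on how values are laid out and encoded, which the sketch does not model.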
