Hi Mich,

The title of this thread is "[DISCUSS]". An SPIP needs a public discussion to collect comments before we can move forward and call a vote on it.
On Mon, Feb 13, 2023 at 2:35 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> Hi,
>
> I thought we already voted to go ahead with this proposal!
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Mon, 13 Feb 2023 at 20:41, kazuyuki tanimura <ktanim...@apple.com>
> wrote:
>
>> Thank you Liang-Chi!
>>
>> Kazu
>>
>> On Feb 11, 2023, at 7:12 PM, L. C. Hsieh <vii...@gmail.com> wrote:
>>
>> Thanks all for your feedback.
>>
>> Given this positive feedback, if there are no other comments or
>> discussion, I will start a vote in the next few days.
>>
>> Thank you again!
>>
>> On Thu, Feb 2, 2023 at 10:12 AM kazuyuki tanimura
>> <ktanim...@apple.com.invalid> wrote:
>>
>>> Thank you all for the +1s and for reviewing the SPIP doc.
>>>
>>> Kazu
>>>
>>> On Feb 1, 2023, at 1:28 AM, Dongjoon Hyun <dongjoon.h...@gmail.com>
>>> wrote:
>>>
>>> +1
>>>
>>> On Wed, Feb 1, 2023 at 12:52 AM Mich Talebzadeh
>>> <mich.talebza...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> On Wed, 1 Feb 2023 at 02:23, huaxin gao <huaxin.ga...@gmail.com> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Tue, Jan 31, 2023 at 6:10 PM DB Tsai <dbt...@dbtsai.com> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On Jan 31, 2023, at 4:16 PM, Yuming Wang <wgy...@gmail.com> wrote:
>>>>>>
>>>>>> +1.
>>>>>>
>>>>>> On Wed, Feb 1, 2023 at 7:42 AM kazuyuki tanimura
>>>>>> <ktanim...@apple.com.invalid> wrote:
>>>>>>
>>>>>>> Great! Much appreciated, Mich!
>>>>>>>
>>>>>>> Kazu
>>>>>>>
>>>>>>> On Jan 31, 2023, at 3:07 PM, Mich Talebzadeh
>>>>>>> <mich.talebza...@gmail.com> wrote:
>>>>>>>
>>>>>>> Thanks, Kazu.
>>>>>>>
>>>>>>> I followed that template link and indeed, as you pointed out, it is a
>>>>>>> common template. If it works then it is what it is.
>>>>>>>
>>>>>>> I will go through your design proposal, and hopefully we can
>>>>>>> review it.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Mich
>>>>>>>
>>>>>>> On Tue, 31 Jan 2023 at 22:34, kazuyuki tanimura <ktanim...@apple.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank you Mich. I followed the instructions at
>>>>>>>> https://spark.apache.org/improvement-proposals.html and used its
>>>>>>>> template.
>>>>>>>> While we are open to revising our design doc, it seems more like you
>>>>>>>> are proposing that the community change the instructions themselves?
>>>>>>>>
>>>>>>>> Kazu
>>>>>>>>
>>>>>>>> On Jan 31, 2023, at 11:24 AM, Mich Talebzadeh
>>>>>>>> <mich.talebza...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Thanks for these proposals. Good suggestions. Is this style of
>>>>>>>> breaking down your approach standard?
>>>>>>>>
>>>>>>>> My view would be that perhaps it makes more sense to follow the
>>>>>>>> industry-established approach of breaking down
>>>>>>>> your technical proposal into:
>>>>>>>>
>>>>>>>> 1. Background
>>>>>>>> 2. Objective
>>>>>>>> 3. Scope
>>>>>>>> 4. Constraints
>>>>>>>> 5. Assumptions
>>>>>>>> 6. Reporting
>>>>>>>> 7. Deliverables
>>>>>>>> 8. Timelines
>>>>>>>> 9. Appendix
>>>>>>>>
>>>>>>>> Your current approach, using the questions below, may not do
>>>>>>>> justice to your proposal:
>>>>>>>>
>>>>>>>> Q1. What are you trying to do? Articulate your objectives using
>>>>>>>> absolutely no jargon. What are you trying to achieve?
>>>>>>>> Q2. What problem is this proposal NOT designed to solve? What
>>>>>>>> issues is the suggested proposal not going to address?
>>>>>>>> Q3. How is it done today, and what are the limits of current
>>>>>>>> practice?
>>>>>>>> Q4. What is new in your approach and why do you think it
>>>>>>>> will succeed?
>>>>>>>> Q5. Who cares? If you are successful, what difference will it make?
>>>>>>>> If your proposal succeeds, what tangible benefits will it add?
>>>>>>>> Q6. What are the risks?
>>>>>>>> Q7. How long will it take?
>>>>>>>> Q8. What are the midterm and final “exams” to check for success?
>>>>>>>>
>>>>>>>> HTH
>>>>>>>>
>>>>>>>> Mich
>>>>>>>>
>>>>>>>> On Tue, 31 Jan 2023 at 17:35, kazuyuki tanimura
>>>>>>>> <ktanim...@apple.com.invalid> wrote:
>>>>>>>>
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> I would like to start a discussion on “Lazy Materialization for
>>>>>>>>> Parquet Read Performance Improvement”.
>>>>>>>>>
>>>>>>>>> Chao and I propose a Parquet reader with lazy materialization. For
>>>>>>>>> Spark-SQL filter operations, evaluating the filters first and
>>>>>>>>> lazily materializing only the values that are actually used can
>>>>>>>>> avoid wasted computation and improve read performance.
>>>>>>>>> The current implementation of Spark requires the read values to
>>>>>>>>> be materialized (i.e. decompressed, decoded, etc.) into memory
>>>>>>>>> before the filters are applied, even though the filters may
>>>>>>>>> eventually throw away many of those values.
>>>>>>>>>
>>>>>>>>> Our design doc is available below.
>>>>>>>>> SPIP Jira: https://issues.apache.org/jira/browse/SPARK-42256
>>>>>>>>> SPIP Doc:
>>>>>>>>> https://docs.google.com/document/d/1Kr3y2fVZUbQXGH0y8AvdCAeWC49QJjpczapiaDvFzME
>>>>>>>>>
>>>>>>>>> Liang-Chi was kind enough to shepherd this effort.
>>>>>>>>>
>>>>>>>>> Thank you
>>>>>>>>> Kazu
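
For readers skimming the thread, the idea in the original proposal (evaluate the filter first, then materialize only the surviving values) can be illustrated with a small sketch. This is not Spark's or Parquet's actual reader code: `EncodedColumn`, `encode`, `eager_scan` and `lazy_scan` are hypothetical names made up for illustration, and "decoding" here is just parsing ASCII bytes so that the number of materialized values can be counted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EncodedColumn:
    """Hypothetical stand-in for an encoded column chunk (not a Parquet API)."""
    raw: List[bytes]
    decoded_count: int = 0  # how many values have been materialized so far

    def decode(self, row: int) -> int:
        # Pretend this is the expensive decompress/decode step.
        self.decoded_count += 1
        return int(self.raw[row].decode("ascii"))


def encode(values: List[int]) -> EncodedColumn:
    """Build a toy encoded column from plain integers."""
    return EncodedColumn([str(v).encode("ascii") for v in values])


def eager_scan(columns: Dict[str, EncodedColumn], filter_col: str,
               predicate: Callable[[int], bool]) -> List[Dict[str, int]]:
    """Materialize every value of every column, then apply the filter."""
    n = len(columns[filter_col].raw)
    rows = [{name: col.decode(i) for name, col in columns.items()}
            for i in range(n)]
    return [row for row in rows if predicate(row[filter_col])]


def lazy_scan(columns: Dict[str, EncodedColumn], filter_col: str,
              predicate: Callable[[int], bool]) -> List[Dict[str, int]]:
    """Decode the filter column first, then decode the other columns
    only for the rows that passed the filter."""
    fcol = columns[filter_col]
    filter_values = [fcol.decode(i) for i in range(len(fcol.raw))]
    selected = [i for i, v in enumerate(filter_values) if predicate(v)]
    result = []
    for i in selected:
        row = {filter_col: filter_values[i]}
        for name, col in columns.items():
            if name != filter_col:
                row[name] = col.decode(i)  # materialized lazily
        result.append(row)
    return result
```

Both functions return the same rows; the difference is how many values of the non-filter columns get decoded. With a selective predicate, `lazy_scan` touches only the rows that survive the filter, which is the saving the proposal describes.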