Hi Peter, Echoing Eugene's and JB's thoughts -- we'd love a PR!
I also wanted to say: we've hit you with a lot of recommendations in this email thread. If you have any questions, you can ask us here -- but we'll of course be happy to answer them during code review as well. Do not feel that meeting all of these criteria is a prerequisite for opening a Pull Request. We may just give you feedback and ask for changes before merging :).

Thanks!
Dan

On Mon, Mar 14, 2016 at 12:27 PM, Jean-Baptiste Onofré <[email protected]> wrote:

> Yes, you already use the "new style" as you use BoundedSource.
>
> Regards
> JB
>
> On 03/14/2016 08:08 PM, Giesin, Peter wrote:
>
>> The MultiLineIO is a BoundedSource and an extension of FileBasedSource.
>> Where the FileBasedSource reads a single line at a time, the MultiLineIO
>> allows the user to define an arbitrary “message” delimiter. It then reads
>> through the file, removing newlines, until the separator is read, finally
>> returning the character sequence that is built.
>>
>> I believe it is already built using the new style but I will compare it
>> to the BigTableIO to confirm that.
>>
>> Peter
>>
>> On 3/14/16, 1:50 PM, "Jean-Baptiste Onofré" <[email protected]> wrote:
>>
>>> I second Eugene here.
>>>
>>> In the past, I developed some IOs using the "old style" (as I did in the
>>> PubSubIO). I'm now refactoring it to use the "new style".
>>>
>>> Regards
>>> JB
>>>
>>> On 03/14/2016 06:47 PM, Eugene Kirpichov wrote:
>>>
>>>> Hi Peter,
>>>> Looking forward to your PR. Please note that source classes are
>>>> relatively tricky to develop, so would you mind briefly explaining what
>>>> your source will do here over email, so that we hash out some possible
>>>> issues early rather than in PR comments?
>>>> Also note that we now recommend packaging IO connectors as PTransforms,
>>>> making the PTransform class itself be a builder, while the Source/Sink
>>>> classes should be kept package-private (rather than exposed to the
>>>> user).
>>>> For an example of a connector packaged in this style, see BigtableIO
>>>> (https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/bigtable/BigtableIO.java).
>>>> The advantage is that this style allows you to restructure the connector
>>>> or add additional transforms into its implementation if necessary, without
>>>> changing the call sites. It might seem less important in case of a simple
>>>> connector like reading lines from file, but it will become much more
>>>> important with things like SplittableDoFn
>>>> <https://issues.apache.org/jira/browse/BEAM-65>.
>>>>
>>>> On Mon, Mar 14, 2016 at 10:29 AM Jean-Baptiste Onofré <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Peter,
>>>>>
>>>>> awesome !
>>>>>
>>>>> Yes, you can create the PR using the github mirror.
>>>>>
>>>>> Does your MultiLineIO use Bounded/Unbounded "new" classes ?
>>>>>
>>>>> Regards
>>>>> JB
>>>>>
>>>>> On 03/14/2016 06:23 PM, Giesin, Peter wrote:
>>>>>
>>>>>> Hi all!
>>>>>>
>>>>>> I am looking to get involved in the project. I have a MultiLineIO
>>>>>> file-based source that I think would be useful. I know the project is
>>>>>> just spinning up but can I simply clone the repo and create a PR for
>>>>>> the new IO? Also looked over JIRA and there are some tickets I can
>>>>>> help out with.
>>>>>>
>>>>>> Best regards,
>>>>>> Peter Giesin
>>>>>> [email protected]
>>>>>>
>>>>>> _____________
>>>>>> The information contained in this message is proprietary and/or
>>>>>> confidential. If you are not the intended recipient, please: (i)
>>>>>> delete the message and all copies; (ii) do not disclose, distribute or
>>>>>> use the message in any manner; and (iii) notify the sender immediately.
>>>>>> In addition, please be aware that any message addressed to our domain
>>>>>> is subject to archiving and review by persons other than the intended
>>>>>> recipient. Thank you.
>>>>>
>>>>> --
>>>>> Jean-Baptiste Onofré
>>>>> [email protected]
>>>>> http://blog.nanthrax.net
>>>>> Talend - http://www.talend.com
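
For readers following the thread, the reading behaviour Peter describes (read through the input, drop newlines, and return the accumulated characters once the delimiter is seen) can be sketched in plain Java. This is only an illustration under stated assumptions, not the MultiLineIO code itself: the class name `DelimitedRecordReader` and the method `readRecords` are hypothetical, it works on a `java.io.Reader` rather than a FileBasedSource, it assumes the delimiter contains no line terminators, and it treats trailing text with no closing delimiter as a final record, which the thread does not specify.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the MultiLineIO implementation discussed above. */
final class DelimitedRecordReader {
    private DelimitedRecordReader() {}

    /**
     * Reads all characters from {@code in}, discarding '\n' and '\r', and emits
     * one record each time {@code delimiter} is seen. The delimiter itself is
     * not included in the record.
     *
     * Assumptions (not from the thread): the delimiter contains no line
     * terminators, and leftover text at end of input becomes a last record.
     */
    static List<String> readRecords(Reader in, String delimiter) throws IOException {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n' || c == '\r') {
                continue; // newlines are removed, as in Peter's description
            }
            current.append((char) c);
            // Check whether the buffer now ends with the delimiter.
            int start = current.length() - delimiter.length();
            if (start >= 0 && current.indexOf(delimiter, start) == start) {
                records.add(current.substring(0, start));
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            records.add(current.toString()); // assumed handling of an unterminated tail
        }
        return records;
    }
}
```

Calling `DelimitedRecordReader.readRecords(new java.io.StringReader(text), "|")` returns the records in order. A real BoundedSource would additionally need to handle splitting at byte offsets, which this sketch deliberately leaves out.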
