Yes, you already use the "new style" as you use BoundedSource.
Regards
JB
On 03/14/2016 08:08 PM, Giesin, Peter wrote:
The MultiLineIO is a BoundedSource and an extension of FileBasedSource. Where
the FileBasedSource reads a single line at a time, the MultiLineIO allows the
user to define an arbitrary “message” delimiter. It then reads through the
file, removing newlines, until the delimiter is read, and finally returns the
character sequence that has been built.
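The reading loop described above can be sketched in plain Java. This is only an illustration, not the actual MultiLineIO code: the class name MultiLineSketch, the method readRecords, and the choice to return an unterminated tail as a final record are all assumptions, and the sketch ignores the split and offset handling that a real FileBasedSource reader has to deal with.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class MultiLineSketch {

    /**
     * Reads records separated by the given delimiter, dropping CR and LF
     * characters along the way. Hypothetical helper, not part of any SDK.
     */
    public static List<String> readRecords(Reader in, char delimiter) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n' || c == '\r') {
                continue; // newlines are removed, not treated as record boundaries
            }
            if (c == delimiter) {
                // Delimiter reached: emit the accumulated character sequence.
                records.add(current.toString());
                current.setLength(0);
            } else {
                current.append((char) c);
            }
        }
        if (current.length() > 0) {
            // Assumption: text after the last delimiter is kept as a record.
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        Reader in = new StringReader("MSG one\ncontinued|MSG two\n|");
        for (String record : readRecords(in, '|')) {
            System.out.println(record);
        }
    }
}
```

With a '|' delimiter, the two physical lines of the first message are joined into a single record before it is returned.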
I believe it is already built using the new style but I will compare it to the
BigTableIO to confirm that.
Peter
On 3/14/16, 1:50 PM, "Jean-Baptiste Onofré" <[email protected]> wrote:
I second Eugene here.
In the past, I developed some IOs using the "old style" (as I did in the
PubSubIO). I'm now refactoring it to use the "new style".
Regards
JB
On 03/14/2016 06:47 PM, Eugene Kirpichov wrote:
Hi Peter,
Looking forward to your PR. Please note that source classes are relatively
tricky to develop, so would you mind briefly explaining what your source
will do here over email, so that we hash out some possible issues early
rather than in PR comments?
Also note that we now recommend packaging IO connectors as PTransforms,
making the PTransform class itself be a builder - while the Source/Sink
classes should be kept package-private (rather than exposed to the user).
For an example of a connector packaged in this style, see BigtableIO (
https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/bigtable/BigtableIO.java
).
The advantage is that this style allows you to restructure the connector or
add additional transforms into its implementation if necessary, without
changing the call sites. It might seem less important in case of a simple
connector like reading lines from file, but it will become much more
important with things like SplittableDoFn
<https://issues.apache.org/jira/browse/BEAM-65>.
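As a rough illustration of the packaging style described above, here is a plain-Java sketch with no SDK dependency. Every name in it (MultiLineReadSketch, Read, from, withDelimiter, describe, MultiLineSource) is a hypothetical stand-in: in a real connector Read would extend PTransform and the source would extend FileBasedSource. The sketch only shows the shape, with a user-facing immutable builder and a package-private source class behind it.

```java
import java.util.Objects;

public class MultiLineReadSketch {

    /** Entry point: users start here and never name the source class. */
    public static Read read() {
        return new Read(null, '|'); // assumed default delimiter
    }

    /** Stand-in for a PTransform that acts as its own immutable builder. */
    public static final class Read {
        private final String filePattern; // null until from(...) is called
        private final char delimiter;

        private Read(String filePattern, char delimiter) {
            this.filePattern = filePattern;
            this.delimiter = delimiter;
        }

        /** Returns a new Read with the file pattern set. */
        public Read from(String filePattern) {
            return new Read(Objects.requireNonNull(filePattern), delimiter);
        }

        /** Returns a new Read with the delimiter set. */
        public Read withDelimiter(char delimiter) {
            return new Read(filePattern, delimiter);
        }

        /**
         * Stand-in for the step where a real transform would be expanded.
         * The source is created here, hidden from the call site.
         */
        public String describe() {
            return new MultiLineSource(filePattern, delimiter).toString();
        }
    }

    /** Package-private: an implementation detail that can change freely. */
    static final class MultiLineSource {
        private final String filePattern;
        private final char delimiter;

        MultiLineSource(String filePattern, char delimiter) {
            this.filePattern = filePattern;
            this.delimiter = delimiter;
        }

        @Override
        public String toString() {
            return "MultiLineSource{" + filePattern + ", delimiter=" + delimiter + "}";
        }
    }

    public static void main(String[] args) {
        Read read = MultiLineReadSketch.read().from("/tmp/in-*.txt").withDelimiter(';');
        System.out.println(read.describe());
    }
}
```

Because call sites only touch read(), from(...) and withDelimiter(...), the body of describe() (or its real counterpart) could later be replaced by a different implementation without changing any caller.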
On Mon, Mar 14, 2016 at 10:29 AM Jean-Baptiste Onofré <[email protected]>
wrote:
Hi Peter,
Awesome!
Yes, you can create the PR using the github mirror.
Does your MultiLineIO use the "new" Bounded/Unbounded classes?
Regards
JB
On 03/14/2016 06:23 PM, Giesin, Peter wrote:
Hi all!
I am looking to get involved in the project. I have a MultiLineIO
file-based source that I think would be useful. I know the project is just
spinning up but can I simply clone the repo and create a PR for the new IO?
I also looked over JIRA and there are some tickets I can help out with.
Best regards,
Peter Giesin
[email protected]
_____________
The information contained in this message is proprietary and/or
confidential. If you are not the intended recipient, please: (i) delete the
message and all copies; (ii) do not disclose, distribute or use the message
in any manner; and (iii) notify the sender immediately. In addition, please
be aware that any message addressed to our domain is subject to archiving
and review by persons other than the intended recipient. Thank you.
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com