Re: [EXTERNAL] Re: Help required - "BucketingSink" usage to write HDFS Files

2017-08-07 Thread Raja . Aravapalli
Thanks very much for the pointers Vinay. That helps ☺


-Raja.

From: vinay patil <vinay18.pa...@gmail.com>
Date: Monday, August 7, 2017 at 1:56 AM
To: "user@flink.apache.org" <user@flink.apache.org>
Subject: Re: [EXTERNAL] Re: Help required - "BucketingSink" usage to write HDFS 
Files

Hi Raja,

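That is why they are in the pending state: pending files are moved to the 
finished state only when a checkpoint completes. You can enable checkpointing 
by calling env.enableCheckpointing() on the StreamExecutionEnvironment.

Once checkpointing is enabled, the files will not remain in the pending state.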

Check this out : 
https://ci.apache.org/projects/flink/flink-docs-release-1.3/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html

Regards,
Vinay Patil
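
The suggestion above can be sketched as a small job. This is a minimal sketch, not the original poster's program: the class name, the socket source, the 10-second checkpoint interval, the placeholder HDFS path, and the "yyyy-MM-dd--HHmm" bucket format are all assumptions made for illustration (the path and format string in the posted code appear truncated in the archive). It assumes flink-streaming-java and flink-connector-filesystem (Flink 1.3) are on the classpath.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

public class HdfsSinkJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Assumption: checkpoint every 10 seconds. Pending part files are
        // moved to their final name only after a checkpoint completes.
        env.enableCheckpointing(10_000L);

        // Hypothetical source, used only to have something to write.
        DataStream<String> lines = env.socketTextStream("localhost", 9999);

        // Placeholder path: substitute the real HDFS base directory.
        BucketingSink<String> hdfsSink =
                new BucketingSink<>("hdfs:///path/to/Test/");
        // Assumed bucket format: one directory per minute.
        hdfsSink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HHmm"));
        // Roll to a new part file once about 2 MB has been written.
        hdfsSink.setBatchSize(1024L * 1024 * 2);

        lines.addSink(hdfsSink);
        env.execute("bucketing-sink-poc");
    }
}
```

With checkpointing enabled, part files that have been rolled should lose the ".pending" suffix shortly after the next successful checkpoint.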

On Mon, Aug 7, 2017 at 9:15 AM, Raja.Aravapalli [via Apache Flink User Mailing 
List archive.] <[hidden email]> wrote:
Hi Vinay,

Thanks for the response.

I have NOT enabled any checkpointing.

Files are rolling correctly every 2 MB, but they remain in the pending state, 
as shown below:

-rw-r--r--   3 2097424 2017-08-06 21:10 ////Test/part-0-0.pending
-rw-r--r--   3 1431430 2017-08-06 21:12 ////Test/part-0-1.pending


Regards,
Raja.

From: vinay patil <[hidden email]>
Date: Sunday, August 6, 2017 at 10:40 PM
To: "[hidden email]" <[hidden email]>
Subject: [EXTERNAL] Re: Help required - "BucketingSink" usage to write HDFS 
Files

Hi Raja,

Have you enabled checkpointing?
The files will be rolled to complete state when the batch size is reached (in 
your case 2 MB) or when the bucket is inactive for a certain amount of time.

Regards,
Vinay Patil
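
The BucketingSink Javadoc describes three part-file states: in-progress, pending, and finished. A part file becomes pending when it is closed for writing, and pending files are moved to finished when a checkpoint succeeds. The following is a simplified, standalone model of that lifecycle, not Flink's implementation: the class PartFileLifecycle and its methods are hypothetical names, and the roll check is reduced to a size comparison after each write. It reproduces the symptom in this thread: without a checkpoint, rolled files stay pending.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the part-file lifecycle (hypothetical class, not Flink code).
public class PartFileLifecycle {
    enum State { IN_PROGRESS, PENDING, FINISHED }

    static final class PartFile {
        final String name;
        long bytes;
        State state = State.IN_PROGRESS;
        PartFile(String name) { this.name = name; }
    }

    private final long batchSize;
    private final List<PartFile> files = new ArrayList<>();
    private PartFile current; // the part file currently being written, if any

    PartFileLifecycle(long batchSize) { this.batchSize = batchSize; }

    // Append bytes to the current part; close it as PENDING once it reaches batchSize.
    void write(long numBytes) {
        if (current == null) {
            current = new PartFile("part-0-" + files.size());
            files.add(current);
        }
        current.bytes += numBytes;
        if (current.bytes >= batchSize) {
            current.state = State.PENDING;
            current = null;
        }
    }

    // Model of a successful checkpoint: every pending file becomes finished.
    void checkpointCompleted() {
        for (PartFile f : files) {
            if (f.state == State.PENDING) {
                f.state = State.FINISHED;
            }
        }
    }

    long count(State s) {
        return files.stream().filter(f -> f.state == s).count();
    }

    public static void main(String[] args) {
        PartFileLifecycle sink = new PartFileLifecycle(2L * 1024 * 1024);
        for (int i = 0; i < 5; i++) {
            sink.write(1024 * 1024); // five writes of 1 MB each
        }
        // No checkpoint yet: rolled files stay pending.
        System.out.println("pending=" + sink.count(State.PENDING)
                + " finished=" + sink.count(State.FINISHED)); // pending=2 finished=0
        sink.checkpointCompleted();
        System.out.println("pending=" + sink.count(State.PENDING)
                + " finished=" + sink.count(State.FINISHED)); // pending=0 finished=2
    }
}
```

In this model the batch size and the inactivity threshold only decide when a file stops being written; the move out of the pending state depends on checkpoints alone.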

On Mon, Aug 7, 2017 at 7:53 AM, Raja.Aravapalli [via Apache Flink User Mailing 
List archive.] <[hidden email]> wrote:

Hi,

I am working on a PoC that writes to HDFS files using the BucketingSink class. 
Although the data is being written to HDFS, the files remain there with a 
“.pending” suffix.


Below is the code I am using. Can someone please help me identify the issue 
and fix it?


BucketingSink hdfsSink = new BucketingSink("hdfs://///Test/");
hdfsSink.setBucketer(new DateTimeBucketer("-MM-dd--HHmm"));
hdfsSink.setBatchSize(1024 * 1024 * 2); // 2 MB
hdfsSink.setInactiveBucketCheckInterval(1L);
hdfsSink.setInactiveBucketThreshold(1L);


Thanks a lot.


Regards,
Raja.





Re: [EXTERNAL] Re: Help required - "BucketingSink" usage to write HDFS Files

2017-08-07 Thread vinay patil
Hi Raja,

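That is why they are in the pending state: pending files are moved to the
finished state only when a checkpoint completes. You can enable checkpointing
by calling env.enableCheckpointing() on the StreamExecutionEnvironment.

Once checkpointing is enabled, the files will not remain in the pending state.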

Check this out :
https://ci.apache.org/projects/flink/flink-docs-release-1.3/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html

Regards,
Vinay Patil






Re: [EXTERNAL] Re: Help required - "BucketingSink" usage to write HDFS Files

2017-08-06 Thread Raja . Aravapalli
Hi Vinay,

Thanks for the response.

I have NOT enabled any checkpointing.

Files are rolling correctly every 2 MB, but they remain in the pending state, 
as shown below:

-rw-r--r--   3 2097424 2017-08-06 21:10 ////Test/part-0-0.pending
-rw-r--r--   3 1431430 2017-08-06 21:12 ////Test/part-0-1.pending


Regards,
Raja.

