Hi Shiva,

If your files are immutable (once a file is placed in the directory, it is never changed afterwards), then the best source to use is the spooling directory source. If the files are mutable, avoid the spooling directory source: Flume will throw an exception and shut the source down, so you'll have to restart it.
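For reference, a minimal sketch of an agent that uses the spooling directory source could look like the fragment below. The agent name (agent), the component names (spool, ch, out) and the directory /var/data/incoming are placeholders I made up for illustration, and the logger sink only stands in for whatever sink you end up using.

```properties
# Hypothetical names: agent "agent", source "spool", channel "ch", sink "out".
agent.sources = spool
agent.channels = ch
agent.sinks = out

# Spooling directory source: files dropped here must not change afterwards.
agent.sources.spool.type = spooldir
# Placeholder path; point this at the directory where the files arrive.
agent.sources.spool.spoolDir = /var/data/incoming
# Adds a header with the originating file's path to each event.
agent.sources.spool.fileHeader = true
agent.sources.spool.channels = ch

# File channel, so queued events survive an agent restart.
agent.channels.ch.type = file

# Logger sink as a stand-in while testing the source.
agent.sinks.out.type = logger
agent.sinks.out.channel = ch
```

One common way to keep the files immutable from Flume's point of view is to write them somewhere else first and move them into the spool directory only once they are complete.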
You can put Flume on a different server than the one where the files reside and have that folder mounted as a local folder via NFS or similar. That isn't an option if the mount would have to cross a firewall, two networks or the internet.

With the exec source it's hard to achieve cross-node execution, as it has to execute a real bash command you provide on the remote node. Even if you manage it, it will be very slow due to constant SSH negotiation.

Either way, I would most definitely recommend putting Flume on the same node where the source folder is, or at least as close to the source as possible, for example in the same network. That way you minimize the influence of network jitter and dropouts on the source. All sources that pull data will fail ungracefully if they encounter an error fetching data, and you'll end up restarting Flume.

If HDFS is in a different network or across the internet, I would suggest chaining two Flume agents, one on each side of the wire, with an Avro sink on the source node and an Avro source on the destination node. They support the fundamental things needed for such a harsh transport environment: serialization, compression, SSL security over a single TCP connection, the need to have only one port open, etc. Then you configure Flume on the destination side to drain into HDFS via the HDFS sink.

On Fri, Oct 2, 2015 at 7:08 AM, Shiva Ram <[email protected]> wrote:

> Set files are placed in the remote server[not a hadoop cluster node],
> which source type is suitable for collecting these files from remote server
> to HDFS using Flume. The initial study on Flume, I came to know source type
> "Exec", "Spooling Directory" can be used to collect these file, I want to
> know whether Flume service should run the remote server[source system from
> where i want to get the data]? Thanks.
>
> Thanks & Regards,
> Shiva Ram
> Website: http://datamaking.com
> Facebook Page: www.facebook.com/datamaking

--
Best regards,
Ahmed Vila | Senior software developer
DevLogic | Sarajevo | Bosnia and Herzegovina
Office : +387 33 942 123 Mobile: +387 62 139 348
Website:
www.devlogic.eu E-mail : [email protected]
---------------------------------------------------------------------
This e-mail and any attachment is for authorised use by the intended recipient(s) only. This email contains confidential information. It should not be copied, disclosed to, retained or used by, any party other than the intended recipient. Any unauthorised distribution, dissemination or copying of this E-mail or its attachments, and/or any use of any information contained in them, is strictly prohibited and may be illegal. If you are not an intended recipient then please promptly delete this e-mail and any attachment and all copies and inform the sender directly via email. Any emails that you send to us may be monitored by systems or persons other than the named communicant for the purposes of ascertaining whether the communication complies with the law and company policies.
