[GitHub] flume pull request #250: FLUME-3146 Use public API HdfsDataOutputStream#getC...
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/250 FLUME-3146 Use public API HdfsDataOutputStream#getCurrentBlockReplication where applicable Took over this issue from Wei-Chiu Chuang. Added a few lines to the tests. All tests pass, the new feature works. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-3146 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/250.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #250 commit c41c97dd7d8b4d92ba99e27d404dc2ddc1b3e7ee Author: Endre Major Date: 2018-11-27T10:14:06Z FLUME-3146 Use public API HdfsDataOutputStream#getCurrentBlockReplication where applicable ---
[GitHub] flume pull request #246: FLUME-2723 batch size trans cap doc update
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/246 FLUME-2723 batch size trans cap doc update An update to the configuration section of the user guide. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-2723 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/246.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #246 commit c28d2d53dca28446ab03134b5bc5363b2e15e08e Author: Endre Major Date: 2018-11-23T09:12:49Z FLUME-2723 batch size trans cap doc update ---
[GitHub] flume pull request #244: FLUME-2989 added 2 KafkaChannel metrics
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/244 FLUME-2989 added 2 KafkaChannel metrics KafkaChannel was missing some metrics: eventTakeAttemptCount and eventPutAttemptCount. This PR is based on the patch included in the issue, which was the work of Umesh Chaudhary. I reworked the test a bit to use Mockito and made some other minor modifications to it. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-2989 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/244.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #244 commit 8de6c94089ddbfbd88e2d13a47c91fa0800bc7d6 Author: Endre Major Date: 2018-11-22T22:09:38Z FLUME-2989 added KafkaChannel metrics eventTakeAttemptCount, eventPutAttemptCount ---
[GitHub] flume pull request #243: FLUME-3243 hdfs.callTimeout deafault increased and ...
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/243 FLUME-3243 hdfs.callTimeout deafault increased and deprecated The default hdfs.callTimeout used by the HDFS sink was too low: only 10 seconds, which can cause problems on a busy system. The new default is 30 seconds. I think this parameter should be deprecated and a new, more error-tolerant solution should be used instead. To enable that future change, I indicated this in the code and in the User Guide. Tested only with the unit tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-3243 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/243.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #243 commit 24c9e5f781fd7ca53c061f1ce5f9a6a555bf95c3 Author: Endre Major Date: 2018-11-22T20:19:41Z FLUME-3243 hdfs.callTimeout deafault increased and deprecated ---
[GitHub] flume pull request #242: FLUME-1342 adding jmx metrics tables to docs
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/242 FLUME-1342 adding jmx metrics tables to docs This PR adds a few tables to the User Guide that describe the metrics published by sources, sinks and channels. I used simple Unix tools to gather the data, then wrote a small utility to convert it to CSV. I then used an online converter, https://www.tablesgenerator.com/, to generate the rst tables, followed by a little manual editing. I also discovered some rst formatting problems in FlumeUserGuide.rst and corrected them. It was a rather painful process to gather the data and find a decent representation. So far this PR only contains the end result. I would be happy to share the utilities; I just don't know what the best way would be. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-1342 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/242.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #242 commit e2bbbc85dd3d07425322f2a335bea4918f071b44 Author: Endre Major Date: 2018-11-22T18:52:45Z FLUME-1342 adding jmx metrics tables to docs ---
[GitHub] flume pull request #237: FLUME-2653 Allow hdfs sink inUseSuffix to be empty
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/237 FLUME-2653 Allow hdfs sink inUseSuffix to be empty This is based on the contributions for FLUME-2653 regarding a new feature for the HDFS sink. Added a new parameter, hdfs.emptyInUseSuffix, to allow the output file name to remain unchanged. See the user guide changes for details. This is a feature desired by the community. I added a new JUnit test case for testing. I also temporarily modified the old test cases in my IDE to use the new flag, and they passed. I did this only as an extra check, to be on the safe side; those changes are not in this PR. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-2653 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/237.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #237 commit 930476595e70b2ecb5fd3a21a732b82391d351f8 Author: Endre Major Date: 2018-11-14T17:44:02Z FLUME-2653 Allow inUseSuffix to be empty ---
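The effect of the new flag on the in-progress file name can be sketched roughly as follows. This is a minimal illustration, not the sink's actual code: the class and method names are hypothetical, and the `.tmp` default for hdfs.inUseSuffix is an assumption made for the example.

```java
// Hypothetical illustration of the naming rule; not the HDFS sink's actual code.
class InUseSuffixSketch {
    // Assumed default value for hdfs.inUseSuffix.
    static final String DEFAULT_IN_USE_SUFFIX = ".tmp";

    /** Returns the name a file carries while it is still being written. */
    static String inUseName(String finalName, String inUseSuffix, boolean emptyInUseSuffix) {
        // When the flag is set, the configured suffix is ignored entirely,
        // so the file keeps the same name before and after it is closed.
        String suffix = emptyInUseSuffix ? "" : inUseSuffix;
        return finalName + suffix;
    }

    public static void main(String[] args) {
        // Flag off: the in-use name carries the suffix.
        System.out.println(inUseName("FlumeData.1", DEFAULT_IN_USE_SUFFIX, false));
        // Flag on: the in-use name equals the final name.
        System.out.println(inUseName("FlumeData.1", DEFAULT_IN_USE_SUFFIX, true));
    }
}
```

With the flag enabled no rename is needed on close, which is the "output file name remains unchanged" behaviour described above.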
[GitHub] flume pull request #235: Flume 3281 Update to Kafka 2.0
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/235 Flume 3281 Update to Kafka 2.0 This has been tested with unit tests. The main difference that caused the most problems is the consumer.poll(Duration) change. This does not block even when it fetches metadata, whereas the previous poll(long timeout) blocked indefinitely while fetching metadata. This has resulted in many test timing issues. I tried to make minimal changes to the tests, just enough to make them pass. Kafka 2.0 requires a higher version of slf4j, so I had to update it to 1.7.25. The migrateZookeeperOffsets option is deprecated in this PR. This will allow us to get rid of the Kafka server libraries in Flume. Compatibility testing: I modified TestUtil to be able to use external servers. This way I could test against a variety of Kafka server versions using the normal unit tests. Channel tests using the 2.0.1 client: Kafka_2.11-0.11.0.3: timeouts in TestPartitions when creating topics; Kafka_2.11-1.0.2: passed; Kafka_2.11-1.1.1: passed; Kafka_2.11-2.0.1: passed. I will publish further results here, today or tomorrow. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-3281 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/235.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #235 commit 2b6818ad2f8c9d8367ba5526f800583f29967464 Author: Endre Major Date: 2018-11-08T14:36:47Z FLUME-3281 Update to Kafka 2.0 commit e1a98bf98bdfa2fc3f524e94ed5e5603477f7820 Author: Endre Major Date: 2018-11-10T16:14:06Z FLUME-3281 TestUtil improvements external servers commit 34ce07c9ba3cc388630a6fd2e5cb14f94b665ca5 Author: Endre Major Date: 2018-11-12T13:31:34Z FLUME-3281 Deprecating migrateZookeeperOffsets ---
[GitHub] flume pull request #229: FLUME-3223 Flume HDFS Sink should retry close prior...
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/229 FLUME-3223 Flume HDFS Sink should retry close prior release lease This is based on @mcsanady's original pull request #202. I took the test changes from him but reworked the implementation of the new feature, since it failed some unit tests. Previously, when a close failed, we immediately did a lease recovery. This PR introduces a background retry mechanism. It uses the already existing "hdfs.closeTries" parameter. Unfortunately, that parameter defaults to infinite retries, which seems a bit too much to me. I also did a minimal code cleanup. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-3223 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/229.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #229 commit 5514f0489ae6091bec5eea814a4c8a9990eede35 Author: Endre Major Date: 2018-09-27T12:05:38Z FLUME-3223 Flume HDFS Sink should retry close prior release lease ---
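One way to picture a background retry of a failed close is the sketch below. It is not the code from this PR: the `Closer` interface, the method names and the retry interval parameter are all hypothetical, and treating a non-positive `maxTries` as "retry until success" is an assumption made to mirror the infinite default mentioned above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and structure are not taken from the PR.
class CloseRetrySketch {
    /** Stand-in for a close operation; returns true when the close succeeded. */
    interface Closer {
        boolean tryClose();
    }

    /** Tries to close once in the calling thread, then retries on the executor. */
    static void closeWithRetry(ScheduledExecutorService executor, Closer closer,
                               int maxTries, long retryIntervalMs, Runnable onGiveUp) {
        attempt(executor, closer, maxTries, retryIntervalMs, onGiveUp, 1);
    }

    private static void attempt(ScheduledExecutorService executor, Closer closer,
                                int maxTries, long retryIntervalMs, Runnable onGiveUp,
                                int tryNumber) {
        if (closer.tryClose()) {
            return; // closed cleanly, nothing left to do
        }
        boolean unlimited = maxTries <= 0; // assumption: non-positive means retry until success
        if (unlimited || tryNumber < maxTries) {
            // Schedule the next attempt in the background instead of blocking the caller.
            executor.schedule(
                () -> attempt(executor, closer, maxTries, retryIntervalMs, onGiveUp, tryNumber + 1),
                retryIntervalMs, TimeUnit.MILLISECONDS);
        } else {
            onGiveUp.run(); // all tries used up, e.g. fall back to lease recovery
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger calls = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);
        // This closer fails twice and succeeds on the third attempt.
        Closer flaky = () -> {
            boolean ok = calls.incrementAndGet() >= 3;
            if (ok) {
                done.countDown();
            }
            return ok;
        };
        closeWithRetry(executor, flaky, 5, 10, done::countDown);
        done.await(5, TimeUnit.SECONDS);
        System.out.println("close attempts: " + calls.get());
        executor.shutdown();
    }
}
```

The fallback is only invoked after the configured number of tries is exhausted, rather than right after the first failed close.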
[GitHub] flume pull request #226: FLUME-2973 BucketWriter deadlock fix
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/226 FLUME-2973 BucketWriter deadlock fix This PR is based on Yan Jian's fix and his test improvements. It also contains the deadlock reproduction contributed by @adenes. I have made minimal changes to those contributions. Denes's test was used for checking the fix. Yan's fix contains an optimization: it first calls the callback function that removes the BucketWriter from the cache. This is useful and should help to avoid some errors. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-2973 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/226.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #226 ---
[GitHub] flume pull request #222: FLUME-3050 add counters for error conditions and ex...
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/222 FLUME-3050 add counters for error conditions and expose to monitor URL Concept: an error is when an Exception is thrown or an ERROR level log is written during event processing. In case of an error, at least one error counter is increased at least once (preferably one counter, once). Errors during event processing are counted. Initialization errors are not handled here. Three types of errors are differentiated: - Channel read/write errors, when the channel throws a ChannelException. - Event read/write errors, e.g. a source cannot read an event due to - Generic errors, e.g. TaildirSource cannot write the position file. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-3050 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/222.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #222 commit c82d23011aa5dcc47df997f47792e8ececf88303 Author: emajor Date: 2018-07-20T15:38:34Z FLUME-3050 WIP commit 8245d210f186fef06a3f7d996116f7c02e66f552 Author: emajor Date: 2018-07-24T12:43:24Z WIP commit 83ae524a37acfcdd2442128fc19b26cdf30f1b45 Author: emajor Date: 2018-07-30T10:03:39Z WIP tests commit eecd494a6b0c7e2000398429520014a143f8ea30 Author: emajor Date: 2018-07-30T11:50:30Z clean up 1 commit b4c9afabd4621d5f68a403644c75bc2c3f211be4 Author: emajor Date: 2018-07-30T16:12:24Z clean up 2 commit cc1d88abc31c5ae81cc16842d5d14418e5176b8b Author: emajor Date: 2018-07-30T16:17:59Z clean up 3 commit 37594abeb2fbd2d695d3585d0351d7295810b5c4 Author: emajor Date: 2018-07-31T14:57:39Z WIP adding further tests commit bc6e4fc18ecfabd0e2a8c9f7911573ee50ce60e7 Author: emajor Date: 2018-08-01T16:40:31Z further tests commit d200eda3195f84b89580aabd5bdac19a9c8c0f8e Author: emajor Date: 2018-08-06T09:45:47Z morphline error counter added commit dd851dda8d3d95c1a37563a9012e153c79a17b37 Author: emajor Date: 2018-08-06T13:51:15Z cleanup and test fix commit 63dff5781adeaab7d8aea74a45e0e9b33e2be06b Author: emajor Date: 2018-08-06T15:23:11Z Adding error counters to ScribeSource ---
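The "one counter, once" idea for the three error categories could look roughly like this. It is a sketch under stated assumptions, not the PR's implementation: the enum, the two local exception classes and the `process` method are hypothetical stand-ins and do not correspond to Flume's own counter classes or its ChannelException.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; none of these names come from the PR.
class ErrorCounterSketch {
    /** The three error categories described above. */
    enum ErrorType { CHANNEL, EVENT, GENERIC }

    /** Local stand-in for a channel failure; not Flume's own exception class. */
    static class ChannelFailure extends RuntimeException {
        ChannelFailure(String message) { super(message); }
    }

    /** Local stand-in for an event that cannot be read or written. */
    static class EventFailure extends RuntimeException {
        EventFailure(String message) { super(message); }
    }

    private final Map<ErrorType, AtomicLong> counters = new EnumMap<>(ErrorType.class);

    ErrorCounterSketch() {
        // Pre-populate so the map is never modified structurally after construction.
        for (ErrorType type : ErrorType.values()) {
            counters.put(type, new AtomicLong());
        }
    }

    long get(ErrorType type) {
        return counters.get(type).get();
    }

    /** Runs one processing step; on failure exactly one counter is increased, once. */
    boolean process(Runnable step) {
        try {
            step.run();
            return true;
        } catch (ChannelFailure e) {
            counters.get(ErrorType.CHANNEL).incrementAndGet();
        } catch (EventFailure e) {
            counters.get(ErrorType.EVENT).incrementAndGet();
        } catch (RuntimeException e) {
            counters.get(ErrorType.GENERIC).incrementAndGet();
        }
        return false;
    }

    public static void main(String[] args) {
        ErrorCounterSketch counters = new ErrorCounterSketch();
        counters.process(() -> { throw new ChannelFailure("channel is full"); });
        counters.process(() -> { throw new IllegalStateException("cannot write position file"); });
        for (ErrorType type : ErrorType.values()) {
            System.out.println(type + " = " + counters.get(type));
        }
    }
}
```

Classifying by exception type in a single place keeps each failure from being counted under more than one category.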
[GitHub] flume pull request #214: FLUME-3239 Do not rename files in SpoolDirectorySou...
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/214 FLUME-3239 Do not rename files in SpoolDirectorySource Added functionality to track files in the meta directory rather than renaming them. Improved tests for checking multilevel directories. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-3239 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/214.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #214 commit 878088fd970cfcaff9ed0a1ce656870f22348532 Author: Endre Major Date: 2018-06-05T12:35:40Z FLUME-3239 Do not rename files in SpoolDirectorySource ---
[GitHub] flume pull request #212: WIP FLUME-3246 Validate flume configuration to prev...
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/212 WIP FLUME-3246 Validate flume configuration to prevent larger source batch size than the channel transaction capacity The loadSources() method seemed like an appropriate place to check this. Added 2 new interfaces for getting the transaction capacity and the batch size fields. The check is only done for channels that implement the TransactionCapacitySupported interface and sources that implement the BatchSizeSupported interface. There is a new unit test case that I used for testing. TODOs: Add the BatchSizeSupported interface to all the sources that handle batch size. Check how this works when reloading configuration. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-3246 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/212.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #212 commit b548e41f4299e45a3b9e1f74c080203c4c301774 Author: emajor Date: 2018-06-19T12:54:50Z FLUME-3246 Validate flume configuration to prevent larger source batch size than the channel transaction capacity ---
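The interface-based check described above can be sketched as follows. The two nested interfaces are local stand-ins that reuse the names from the description; their method names and the `isCompatible` helper are assumptions for illustration, not the code added to loadSources().

```java
// Hypothetical sketch of the validation; not the code from the PR.
class BatchSizeValidationSketch {
    /** Local stand-in; the method name is an assumption. */
    interface BatchSizeSupported {
        long getBatchSize();
    }

    /** Local stand-in; the method name is an assumption. */
    interface TransactionCapacitySupported {
        long getTransactionCapacity();
    }

    /**
     * Returns false only when both values are known and the source batch size
     * exceeds the channel transaction capacity. Components that do not expose
     * the relevant value are not checked.
     */
    static boolean isCompatible(Object source, Object channel) {
        if (source instanceof BatchSizeSupported
                && channel instanceof TransactionCapacitySupported) {
            long batchSize = ((BatchSizeSupported) source).getBatchSize();
            long capacity = ((TransactionCapacitySupported) channel).getTransactionCapacity();
            return batchSize <= capacity;
        }
        return true;
    }

    public static void main(String[] args) {
        BatchSizeSupported bigBatchSource = () -> 1000;
        TransactionCapacitySupported smallChannel = () -> 100;
        // A batch of 1000 cannot fit into a transaction of capacity 100.
        System.out.println(isCompatible(bigBatchSource, smallChannel));
        // A component that exposes neither value is skipped by the check.
        System.out.println(isCompatible(new Object(), smallChannel));
    }
}
```

Because the check is opt-in per component, sources that have not yet been given the BatchSizeSupported interface pass unvalidated, which is what the first TODO above addresses.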
[GitHub] flume pull request #211: FLUME-3239 Do not rename files in SpoolDirectorySou...
GitHub user majorendre closed the pull request at: https://github.com/apache/flume/pull/211 ---
[GitHub] flume pull request #211: FLUME-3239 Do not rename files in SpoolDirectorySou...
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/211 FLUME-3239 Do not rename files in SpoolDirectorySource WIP Work in progress. Added functionality to track files in the meta directory rather than renaming them. This is an early preview to check if the direction is right. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-3239 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/211.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #211 commit c0695ad01f172248341eec849ca2b4d0848819b1 Author: Endre Major Date: 2018-06-05T12:35:40Z FLUME-3239 Do not rename files in SpoolDirectorySource ---
[GitHub] flume pull request #208: FLUME-3222 Fix for NoSuchFileException thrown when ...
GitHub user majorendre opened a pull request: https://github.com/apache/flume/pull/208 FLUME-3222 Fix for NoSuchFileException thrown when files are being deleted from the TAILDIR source We fetch file names from a directory and later we fetch inodes. If there is a delete between these operations this problem occurs. Reproduced from unit test. Added exception handling to handle this case. It is enough to ignore the NoSuchFileException and continue. You can merge this pull request into a Git repository by running: $ git pull https://github.com/majorendre/flume FLUME-3222 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flume/pull/208.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #208 commit c291b621514f5aa1e0f9fcdc5ba897c66d4ce43f Author: Endre Major Date: 2018-05-29T14:31:27Z FLUME-3222 Fix for NoSuchFileException thrown when files are being deleted from the TAILDIR source ---
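The race between listing files and reading their inodes, and the "ignore and continue" handling, can be illustrated with plain java.nio. This is a sketch, not the Taildir source code: the class and method names are hypothetical, and `BasicFileAttributes.fileKey()` is used here as a portable stand-in for the inode (it may be null on some platforms).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical sketch; not the Taildir source's actual code.
class InodeScanSketch {
    /**
     * Reads the file key for every path that was listed earlier. A path that
     * was deleted between the listing and this call is skipped silently.
     */
    static Map<Path, Object> readKeys(List<Path> listed) {
        Map<Path, Object> keys = new LinkedHashMap<>();
        for (Path file : listed) {
            try {
                BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
                keys.put(file, attrs.fileKey());
            } catch (NoSuchFileException e) {
                // Deleted after it was listed: ignore it and continue with the rest.
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return keys;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("taildir-sketch");
        Path kept = Files.createFile(dir.resolve("kept.log"));
        Path doomed = Files.createFile(dir.resolve("doomed.log"));

        // Step 1: fetch the file names.
        List<Path> listed = new ArrayList<>();
        try (Stream<Path> stream = Files.list(dir)) {
            stream.forEach(listed::add);
        }

        // A delete happens between the two steps.
        Files.delete(doomed);

        // Step 2: fetch the keys; the deleted file no longer breaks the scan.
        System.out.println(readKeys(listed).keySet());

        Files.delete(kept);
        Files.delete(dir);
    }
}
```

Only NoSuchFileException is swallowed; any other I/O failure still surfaces, so genuine problems are not hidden by the fix.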