Ah, ok. One of the comments on the issue led me to believe that it was the same issue as the missing custom log closer.
On Sat, Jun 23, 2018, 01:10 Stephen Meyles <smey...@gmail.com> wrote:
>
> > I'm not convinced this is a write pattern issue, though. I commented
> > on..
>
> The note there suggests the need for a LogCloser implementation; in my
> (ADLS) case I've written one and have it configured - the exception I'm
> seeing involves failures during writes, not during recovery (though it
> then leads to a need for recovery).
>
> S.
>
> On Fri, Jun 22, 2018 at 4:33 PM, Christopher <ctubb...@apache.org> wrote:
>
>> Unfortunately, that feature wasn't added until 2.0, which hasn't yet
>> been released, but I'm hoping it will be later this year.
>>
>> However, I'm not convinced this is a write pattern issue. I commented on
>> https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
>>
>> On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles <smey...@gmail.com> wrote:
>>
>>> Knowing that HBase has been run successfully on ADLS, I went looking
>>> there (as it has the same WAL write pattern). This is informative:
>>>
>>> https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
>>>
>>> which suggests a need to split the WALs off onto HDFS proper versus
>>> ADLS (or presumably GCS), barring changes in the underlying semantics
>>> of each. AFAICT you can't currently configure Accumulo to send WAL
>>> logs to a separate cluster - is this correct?
>>>
>>> S.
>>>
>>> On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <smey...@gmail.com> wrote:
>>>
>>>> > Did you try to adjust any Accumulo properties to do bigger writes
>>>> > less frequently or something like that?
>>>>
>>>> We're using BatchWriters and sending reasonably large batches of
>>>> Mutations.
>>>> Given the stack traces in both our cases are related to WAL writes,
>>>> it seems like batch size would be the only tweak available here
>>>> (though, without reading the code carefully, it's not even clear to
>>>> me that it is impactful), but if others have suggestions I'd be
>>>> happy to try them.
>>>>
>>>> Given we have this working well and stable in other clusters atop
>>>> traditional HDFS, I'm currently pursuing this further with MS to
>>>> understand the variance with ADLS. Depending on what emerges from
>>>> that, I may circle back with more details and a bug report, and
>>>> start digging more deeply into the relevant code in Accumulo.
>>>>
>>>> S.
>>>>
>>>> On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin <kolchin...@gmail.com> wrote:
>>>>
>>>>> > If somebody is interested in using Accumulo on GCS, I'd like to
>>>>> > encourage them to submit any bugs they encounter, and any patches
>>>>> > (if they are able) which resolve those bugs.
>>>>>
>>>>> I'd like to contribute a fix, but I don't know where to start. We
>>>>> tried to get help from Google Support about [1] over email, but
>>>>> they just say that GCS doesn't support such a write pattern. In the
>>>>> end, we can only guess at how to adjust Accumulo's behaviour to
>>>>> minimise broken connections to GCS.
>>>>>
>>>>> BTW, although we observe this exception, the tablet server doesn't
>>>>> fail, which means that after some retries it is able to write WALs
>>>>> to GCS.
>>>>>
>>>>> @Stephen,
>>>>>
>>>>> > discussions with MS engineers have suggested, similar to the GCS
>>>>> > thread, that small writes at high volume are, at best, suboptimal
>>>>> > for ADLS.
>>>>>
>>>>> Did you try to adjust any Accumulo properties to do bigger writes
>>>>> less frequently or something like that?
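[To make the batching question above concrete: the knobs that control how much a client buffers before writing live on BatchWriterConfig, not in server-side properties. A minimal sketch; the table name and sizes below are illustrative placeholders, and this shapes client-to-tserver RPCs rather than the tserver's own WAL writes:]

```java
import java.util.concurrent.TimeUnit;

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;

public class BatchTuning {
  // Returns a writer that buffers more data and flushes less often.
  // "mytable" and the sizes are placeholders, not recommendations.
  static BatchWriter bigBatchWriter(Connector connector) throws Exception {
    BatchWriterConfig config = new BatchWriterConfig();
    config.setMaxMemory(64 * 1024 * 1024);        // buffer up to 64 MB of mutations client-side
    config.setMaxLatency(120, TimeUnit.SECONDS);  // hold mutations up to 2 minutes before flushing
    config.setMaxWriteThreads(4);                 // concurrent sends to tablet servers
    return connector.createBatchWriter("mytable", config);
  }
}
```

[Even with large client batches, each mutation still passes through the tserver's write-ahead log, so this may not change the small-write pattern the object stores object to.]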
>>>>>
>>>>> [1]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>>
>>>>> Maxim
>>>>>
>>>>> On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <smey...@gmail.com> wrote:
>>>>>
>>>>>> I think we're seeing something similar, but in our case we're
>>>>>> trying to run Accumulo atop ADLS. When we generate sufficient
>>>>>> write load we start to see stack traces like the following:
>>>>>>
>>>>>> [log.DfsLogger] ERROR: Failed to write log entries
>>>>>> java.io.IOException: attempting to write to a closed stream;
>>>>>>   at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
>>>>>>   at com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
>>>>>>   at org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
>>>>>>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
>>>>>>   at java.io.DataOutputStream.write(DataOutputStream.java:88)
>>>>>>   at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>>>>>   at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>>>>>>   at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)
>>>>>>
>>>>>> We have developed a rudimentary LogCloser implementation that
>>>>>> allows us to recover from this, but overall performance is
>>>>>> significantly impacted.
>>>>>>
>>>>>> > As for the WAL closing issue on GCS, I recall a previous thread
>>>>>> > about that
>>>>>>
>>>>>> I searched more for this but wasn't able to find anything, nor
>>>>>> anything similar re: ADL. I am also curious about the earlier
>>>>>> question:
>>>>>>
>>>>>> >> Does Accumulo have a specific write pattern [to WALs], so that
>>>>>> >> a file system may not support it?
>>>>>>
>>>>>> as discussions with MS engineers have suggested, similar to the
>>>>>> GCS thread, that small writes at high volume are, at best,
>>>>>> suboptimal for ADLS.
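[For readers hitting the same IOException: the recovery hook mentioned above is Accumulo's LogCloser interface. During log recovery, the master delegates closing a dead tserver's WAL to the class named by the master.walog.closer.implementation property (HadoopLogCloser by default in the 1.x line). A sketch of what an ADLS-oriented implementation might look like; the class name is hypothetical, and the exact interface shape should be checked against the Accumulo version in use:]

```java
import java.io.IOException;

import org.apache.accumulo.core.conf.AccumuloConfiguration;
import org.apache.accumulo.server.fs.VolumeManager;
import org.apache.accumulo.server.master.recovery.LogCloser;
import org.apache.hadoop.fs.Path;

public class AdlLogCloser implements LogCloser {
  @Override
  public long close(AccumuloConfiguration conf, VolumeManager fs, Path path)
      throws IOException {
    // ADLS has no HDFS-style lease recovery: once the writing tserver's
    // stream fails, the file is effectively closed, so there is nothing
    // to recover here. Returning 0 indicates the log is ready to be
    // sorted and replayed; a positive value would ask the master to
    // retry after that many milliseconds (per the 1.x behavior).
    return 0;
  }
}
```

[This would then be wired in via accumulo-site.xml by setting master.walog.closer.implementation to the class name, e.g. com.example.AdlLogCloser.]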
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Stephen
>>>>>>
>>>>>> On Wed, Jun 20, 2018 at 11:20 AM, Christopher <ctubb...@apache.org> wrote:
>>>>>>
>>>>>>> For what it's worth, this is an Apache project, not a Sqrrl
>>>>>>> project. Amazon is free to contribute to Accumulo to improve its
>>>>>>> support of their platform, just as anybody is free to do. Amazon
>>>>>>> may start contributing more as a result of their acquisition...
>>>>>>> or they may not. There is no reason to expect that their
>>>>>>> acquisition will have any impact whatsoever on the platforms
>>>>>>> Accumulo supports, because Accumulo is not, and has never been, a
>>>>>>> Sqrrl project (although some Sqrrl employees have contributed),
>>>>>>> and thus will not become an Amazon project. It has been, and will
>>>>>>> remain, a vendor-neutral Apache project. Regardless, we welcome
>>>>>>> contributions from anybody which would improve Accumulo's support
>>>>>>> of any additional platform alternatives to HDFS, whether it be
>>>>>>> GCS, S3, or something else.
>>>>>>>
>>>>>>> As for the WAL closing issue on GCS, I recall a previous thread
>>>>>>> about that... I think a simple patch might be possible to solve
>>>>>>> that issue, but to date, nobody has contributed a fix. If
>>>>>>> somebody is interested in using Accumulo on GCS, I'd like to
>>>>>>> encourage them to submit any bugs they encounter, and any patches
>>>>>>> (if they are able) which resolve those bugs. If they need help
>>>>>>> submitting a fix, please ask on the dev@ list.
>>>>>>>
>>>>>>> On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <threadedb...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Maxim,
>>>>>>>>
>>>>>>>> Interesting that you were able to run Accumulo on GCS. I never
>>>>>>>> thought of that--good to know.
>>>>>>>>
>>>>>>>> Since I am now an AWS guy (at least for the time being), in
>>>>>>>> light of the fact that Amazon purchased Sqrrl, I am interested
>>>>>>>> to see what develops.
>>>>>>>>
>>>>>>>> On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <kolchin...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Geoffry,
>>>>>>>>>
>>>>>>>>> Thank you for the feedback!
>>>>>>>>>
>>>>>>>>> Thanks to [1, 2], I was able to run an Accumulo cluster on
>>>>>>>>> Google VMs with GCS instead of HDFS. And I used Google Dataproc
>>>>>>>>> to run Hadoop jobs on Accumulo. Almost everything was good
>>>>>>>>> until I faced some connection issues with GCS. Quite often, the
>>>>>>>>> connection to GCS breaks on writing or closing WALs.
>>>>>>>>>
>>>>>>>>> To all,
>>>>>>>>>
>>>>>>>>> Does Accumulo have a specific write pattern, such that a file
>>>>>>>>> system may not support it? Are there Accumulo properties which
>>>>>>>>> I can play with to adjust the write pattern?
>>>>>>>>>
>>>>>>>>> [1]: https://github.com/cybermaggedon/accumulo-gs
>>>>>>>>> [2]: https://github.com/cybermaggedon/accumulo-docker
>>>>>>>>>
>>>>>>>>> Thank you!
>>>>>>>>> Maxim
>>>>>>>>>
>>>>>>>>> On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts <threadedb...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I tried running Accumulo on Google. I first tried running it
>>>>>>>>>> on Google's pre-made Hadoop. I found the various file paths
>>>>>>>>>> one must contend with are different on Google than on a
>>>>>>>>>> straight download from Apache. It seems they moved things
>>>>>>>>>> around. To counter this, I installed my own Hadoop along with
>>>>>>>>>> Zookeeper and Accumulo on a Google node. All went well until
>>>>>>>>>> one fine day when I could no longer log in. It seems Google
>>>>>>>>>> had pushed out some changes overnight that broke my
>>>>>>>>>> client-side Google Cloud installation.
>>>>>>>>>> Google referred the affected to a lengthy,
>>>>>>>>>> easy-to-make-a-mistake procedure for resolving the issue.
>>>>>>>>>>
>>>>>>>>>> I decided life was too short for this kind of thing and
>>>>>>>>>> switched to Amazon.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin <kolchin...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Does anyone have experience running Accumulo on top of
>>>>>>>>>>> Google Cloud Storage instead of HDFS? In [1] you can see some
>>>>>>>>>>> details if you've never heard about this feature.
>>>>>>>>>>>
>>>>>>>>>>> I see some discussion (see [2], [3]) around this topic, but
>>>>>>>>>>> it looks to me like it isn't as popular as, I believe, it
>>>>>>>>>>> should be.
>>>>>>>>>>>
>>>>>>>>>>> [1]: https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
>>>>>>>>>>> [2]: https://github.com/apache/accumulo/issues/428
>>>>>>>>>>> [3]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Maxim
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> There are ways and there are ways,
>>>>>>>>>>
>>>>>>>>>> Geoffry Roberts
>>>>>>>>
>>>>>>>> --
>>>>>>>> There are ways and there are ways,
>>>>>>>>
>>>>>>>> Geoffry Roberts