And will you be so kind as to share the link with this email distro please?

On 1/16/19, 11:41 AM, "Keith Turner" <[email protected]> wrote:

    Maxim,
    
    This is very interesting.  Would you be interested in writing an
    Accumulo blog post about your experience?  If you are interested I can
    help.
    
    Keith
    
    On Tue, Jan 15, 2019 at 10:03 AM Maxim Kolchin <[email protected]> wrote:
    >
    > Hi,
    >
    > I just wanted to leave intermediate feedback on the topic.
    >
    > So far, Accumulo works pretty well on top of Google Storage. The 
aforementioned issue still exists, but it doesn't break anything. However, I 
can't give you any useful performance numbers at the moment.
    >
    > The cluster:
    >
    >  - master (with zookeeper) (n1-standard-1) + 2 tservers (n1-standard-4)
    >  - 32+ billion entries
    >  - 5 tables (excluding system tables)
    >
    > Some averaged numbers from two use cases:
    >
    >  - batch write into pre-split tables with 40 client machines + 4 tservers (n1-standard-4) - max speed 1.5M entries/sec.
    >  - sequential read with 2 client iterators (1 - filters by key, 2 - filters by timestamp), with 5 client machines + 2 tservers (n1-standard-4) and less than 60k entries returned - max speed 1M+ entries/sec.
    >
    > Maxim
    >
    > On Mon, Jun 25, 2018 at 12:57 AM Christopher <[email protected]> wrote:
    >>
    >> Ah, ok. One of the comments on the issue led me to believe that it was 
the same issue as the missing custom log closer.
    >>
    >> On Sat, Jun 23, 2018, 01:10 Stephen Meyles <[email protected]> wrote:
    >>>
    >>> > I'm not convinced this is a write pattern issue, though. I commented 
on..
    >>>
    >>> The note there suggests the need for a LogCloser implementation; in my 
(ADLS) case I've written one and have it configured - the exception I'm seeing 
involves failures during writes, not during recovery (though it then leads to a 
need for recovery).
    >>>
    >>> S.
    >>>
    >>> On Fri, Jun 22, 2018 at 4:33 PM, Christopher <[email protected]> 
wrote:
    >>>>
    >>>> Unfortunately, that feature wasn't added until 2.0, which hasn't yet 
been released, but I'm hoping it will be later this year.
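
    For reference, 2.0 lets `instance.volumes` list multiple filesystems and a volume chooser direct WALs to one of them. A sketch of an `accumulo.properties` fragment, assuming the `PreferredVolumeChooser` and its per-scope `general.custom.volume.preferred.*` property names (names and the namenode/bucket values are illustrative; verify against the 2.0 documentation):

    ```properties
    # Both volumes are available to Accumulo; namenode and bucket names are placeholders.
    instance.volumes=hdfs://namenode:8020/accumulo,gs://my-bucket/accumulo

    # Per-scope volume selection; names below assume the 2.0 PreferredVolumeChooser.
    general.volume.chooser=org.apache.accumulo.server.fs.PreferredVolumeChooser
    general.custom.volume.preferred.default=gs://my-bucket/accumulo
    # Keep write-ahead logs on HDFS, where append/flush semantics are well defined.
    general.custom.volume.preferred.logger=hdfs://namenode:8020/accumulo
    ```
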
    >>>>
    >>>> However, I'm not convinced this is a write pattern issue. I commented on https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
    >>>>
    >>>> On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles <[email protected]> 
wrote:
    >>>>>
    >>>>> Knowing that HBase has been run successfully on ADLS, I went looking there (as they have the same WAL write pattern). This is informative:
    >>>>>
    >>>>>     https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
    >>>>>
    >>>>> which suggests a need to split the WALs off onto HDFS proper versus ADLS (or presumably GCS), barring changes in the underlying semantics of each. AFAICT you can't currently configure Accumulo to send WALs to a separate cluster - is this correct?
    >>>>>
    >>>>> S.
    >>>>>
    >>>>>
    >>>>> On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <[email protected]> 
wrote:
    >>>>>>
    >>>>>> > Did you try to adjust any Accumulo properties to do bigger writes 
less frequently or something like that?
    >>>>>>
    >>>>>> We're using BatchWriters and sending reasonably large batches of Mutations. Given the stack traces in both our cases are related to WAL writes, it seems like batch size would be the only tweak available here (though, without reading the code carefully, it's not even clear to me that it is impactful), but if others have suggestions I'd be happy to try them.
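
The batch-size tweak mentioned above is set client-side through `BatchWriterConfig` (standard Accumulo 1.x client API; the values and the commented-out connector/table names are illustrative placeholders, not recommendations):

```java
import java.util.concurrent.TimeUnit;

import org.apache.accumulo.core.client.BatchWriterConfig;

public class WriterTuning {
    /** Larger buffer + higher latency => fewer, bigger flushes to the tserver. */
    public static BatchWriterConfig tunedConfig() {
        BatchWriterConfig cfg = new BatchWriterConfig();
        cfg.setMaxMemory(100L * 1024 * 1024);    // buffer up to 100 MB of mutations client-side
        cfg.setMaxLatency(2, TimeUnit.MINUTES);  // hold mutations up to 2 minutes before flushing
        cfg.setMaxWriteThreads(8);               // parallel sends to tablet servers
        return cfg;
    }
    // Usage (connector and table name are placeholders):
    // BatchWriter bw = connector.createBatchWriter("mytable", tunedConfig());
}
```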
    >>>>>>
    >>>>>> Given we have this working well and stably in other clusters atop traditional HDFS, I'm currently pursuing this further with MS to understand the variance on ADLS. Depending on what emerges from that I may circle back with more details and a bug report and start digging more deeply into the relevant code in Accumulo.
    >>>>>>
    >>>>>> S.
    >>>>>>
    >>>>>>
    >>>>>> On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin 
<[email protected]> wrote:
    >>>>>>>
    >>>>>>> > If somebody is interested in using Accumulo on GCS, I'd like to 
encourage them to submit any bugs they encounter, and any patches (if they are 
able) which resolve those bugs.
    >>>>>>>
    >>>>>>> I'd like to contribute a fix, but I don't know where to start. We tried to get help from Google Support about [1] over email, but they just say that GCS doesn't support such a write pattern. In the end, we can only guess how to adjust the Accumulo behaviour to minimise broken connections to GCS.
    >>>>>>>
    >>>>>>> BTW, although we observe this exception, the tablet server doesn't fail, which means that after some retries it is able to write WALs to GCS.
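
The retrying behaviour described here follows the classic retry-with-backoff pattern; a generic plain-Java sketch of the idea (not Accumulo's actual internal code):

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class Retry {
    /** Retries an I/O action with exponential backoff; rethrows after maxAttempts. */
    public static <T> T withBackoff(Callable<T> action, int maxAttempts, long initialDelayMs)
            throws Exception {
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                if (attempt >= maxAttempts) throw e; // give up: surface the failure
                Thread.sleep(delay);
                delay *= 2; // double the wait between attempts
            }
        }
    }
}
```

A transiently broken connection then looks like a stall rather than a failure, which matches the observation that the tserver survives these exceptions.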
    >>>>>>>
    >>>>>>> @Stephen,
    >>>>>>>
    >>>>>>> > since discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.
    >>>>>>>
    >>>>>>> Did you try to adjust any Accumulo properties to do bigger writes 
less frequently or something like that?
    >>>>>>>
    >>>>>>> [1]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
    >>>>>>>
    >>>>>>> Maxim
    >>>>>>>
    >>>>>>> On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <[email protected]> 
wrote:
    >>>>>>>>
    >>>>>>>> I think we're seeing something similar but in our case we're 
trying to run Accumulo atop ADLS. When we generate sufficient write load we 
start to see stack traces like the following:
    >>>>>>>>
    >>>>>>>> [log.DfsLogger] ERROR: Failed to write log entries
    >>>>>>>> java.io.IOException: attempting to write to a closed stream;
    >>>>>>>> at 
com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
    >>>>>>>> at 
com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
    >>>>>>>> at 
org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
    >>>>>>>> at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
    >>>>>>>> at java.io.DataOutputStream.write(DataOutputStream.java:88)
    >>>>>>>> at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
    >>>>>>>> at 
org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
    >>>>>>>> at 
org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)
    >>>>>>>>
    >>>>>>>> We have developed a rudimentary LogCloser implementation that allows us to recover from this, but overall performance is significantly impacted.
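
For anyone facing the same problem, a minimal closer along these lines might look as follows, assuming the Accumulo 1.x `LogCloser` interface (`long close(AccumuloConfiguration, VolumeManager, Path)`) and the `master.walog.closer.implementation` property; both names should be verified against your version:

```java
import java.io.IOException;

import org.apache.accumulo.core.conf.AccumuloConfiguration;
import org.apache.accumulo.server.fs.VolumeManager;
import org.apache.accumulo.server.master.recovery.LogCloser;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical closer for filesystems without HDFS-style lease recovery
 * (e.g. ADLS): there is no lease to recover, so report the log as closed
 * and let recovery proceed.
 */
public class AdlsLogCloser implements LogCloser {
    @Override
    public long close(AccumuloConfiguration conf, VolumeManager fs, Path path)
            throws IOException {
        // No recoverLease() equivalent on ADLS; the server-side stream is
        // effectively closed once the client connection drops.
        return 0; // 0 = done, no retry delay needed
    }
}
```

Wired in via something like `master.walog.closer.implementation=AdlsLogCloser` in accumulo-site.xml (property name from 1.x, an assumption here).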
    >>>>>>>>
    >>>>>>>> > As for the WAL closing issue on GCS, I recall a previous thread 
about that
    >>>>>>>>
    >>>>>>>> I searched for this some more but wasn't able to find anything, nor anything similar re: ADLS. I am also curious about the earlier question:
    >>>>>>>>
    >>>>>>>> >> Does Accumulo have a specific write pattern [for WALs] that a file system may not support?
    >>>>>>>>
    >>>>>>>> since discussions with MS engineers have suggested, similar to the GCS thread, that small writes at high volume are, at best, suboptimal for ADLS.
    >>>>>>>>
    >>>>>>>> Regards
    >>>>>>>>
    >>>>>>>> Stephen
    >>>>>>>>
    >>>>>>>>
    >>>>>>>> On Wed, Jun 20, 2018 at 11:20 AM, Christopher 
<[email protected]> wrote:
    >>>>>>>>>
    >>>>>>>>> For what it's worth, this is an Apache project, not a Sqrrl 
project. Amazon is free to contribute to Accumulo to improve its support of 
their platform, just as anybody is free to do. Amazon may start contributing 
more as a result of their acquisition... or they may not. There is no reason to 
expect that their acquisition will have any impact whatsoever on the platforms 
Accumulo supports, because Accumulo is not, and has not ever been, a Sqrrl 
project (although some Sqrrl employees have contributed), and thus will not 
become an Amazon project. It has been, and will remain, a vendor-neutral Apache 
project. Regardless, we welcome contributions from anybody which would improve 
Accumulo's support of any additional platform alternatives to HDFS, whether it 
be GCS, S3, or something else.
    >>>>>>>>>
    >>>>>>>>> As for the WAL closing issue on GCS, I recall a previous thread 
about that... I think a simple patch might be possible to solve that issue, but 
to date, nobody has contributed a fix. If somebody is interested in using 
Accumulo on GCS, I'd like to encourage them to submit any bugs they encounter, 
and any patches (if they are able) which resolve those bugs. If they need help 
submitting a fix, please ask on the dev@ list.
    >>>>>>>>>
    >>>>>>>>>
    >>>>>>>>>
    >>>>>>>>> On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts 
<[email protected]> wrote:
    >>>>>>>>>>
    >>>>>>>>>> Maxim,
    >>>>>>>>>>
    >>>>>>>>>> Interesting that you were able to run Accumulo on GCS.  I never thought of that--good to know.
    >>>>>>>>>>
    >>>>>>>>>> Since I am now an AWS guy (at least for the time being), in light of the fact that Amazon purchased Sqrrl, I am interested to see what develops.
    >>>>>>>>>>
    >>>>>>>>>>
    >>>>>>>>>> On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin 
<[email protected]> wrote:
    >>>>>>>>>>>
    >>>>>>>>>>> Hi Geoffry,
    >>>>>>>>>>>
    >>>>>>>>>>> Thank you for the feedback!
    >>>>>>>>>>>
    >>>>>>>>>>> Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs with GCS instead of HDFS, and I used Google Dataproc to run Hadoop jobs on Accumulo. Almost everything was fine until I faced some connection issues with GCS. Quite often, the connection to GCS breaks while writing or closing WALs.
    >>>>>>>>>>>
    >>>>>>>>>>> To all,
    >>>>>>>>>>>
    >>>>>>>>>>> Does Accumulo have a specific write pattern that a file system may not support? Are there Accumulo properties which I can play with to adjust the write pattern?
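
On the server side, the properties that plausibly shape the WAL write pattern are the mutation-queue and WAL sizing settings. A sketch of an accumulo-site.xml fragment - property names are from Accumulo 1.x and the values are illustrative, so verify both against your version's documentation before use:

```xml
<!-- Buffer more mutations before each WAL write (fewer, larger appends). -->
<property>
  <name>tserver.mutation.queue.max</name>
  <value>50M</value>
</property>
<!-- Roll write-ahead logs less often by allowing them to grow larger. -->
<property>
  <name>tserver.walog.max.size</name>
  <value>2G</value>
</property>
<!-- Tolerate more logs per tablet before forcing minor compactions. -->
<property>
  <name>table.compaction.minor.logs.threshold</name>
  <value>6</value>
</property>
```
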
    >>>>>>>>>>>
    >>>>>>>>>>> [1]: https://github.com/cybermaggedon/accumulo-gs
    >>>>>>>>>>> [2]: https://github.com/cybermaggedon/accumulo-docker
    >>>>>>>>>>>
    >>>>>>>>>>> Thank you!
    >>>>>>>>>>> Maxim
    >>>>>>>>>>>
    >>>>>>>>>>> On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts 
<[email protected]> wrote:
    >>>>>>>>>>>>
    >>>>>>>>>>>> I tried running Accumulo on Google.  I first tried running it on Google's pre-made Hadoop.  I found the various file paths one must contend with are different on Google than on a straight download from Apache.  It seems they moved things around.  To counter this, I installed my own Hadoop along with Zookeeper and Accumulo on a Google node.  All went well until one fine day when I could no longer log in.  It seems Google had pushed out some changes overnight that broke my client-side Google Cloud installation.  Google referred the affected to a lengthy, easy-to-make-a-mistake procedure for resolving the issue.
    >>>>>>>>>>>>
    >>>>>>>>>>>> I decided life was too short for this kind of thing and 
switched to Amazon.
    >>>>>>>>>>>>
    >>>>>>>>>>>> On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin 
<[email protected]> wrote:
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> Hi all,
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> Does anyone have experience running Accumulo on top of Google Cloud Storage instead of HDFS? See [1] for some details if you've never heard of this feature.
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> I see some discussion (see [2], [3]) around this topic, but it looks to me that it isn't as popular as, I believe, it should be.
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> [1]: https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
    >>>>>>>>>>>>> [2]: https://github.com/apache/accumulo/issues/428
    >>>>>>>>>>>>> [3]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
    >>>>>>>>>>>>>
    >>>>>>>>>>>>> Best regards,
    >>>>>>>>>>>>> Maxim
    >>>>>>>>>>>>
    >>>>>>>>>>>>
    >>>>>>>>>>>>
    >>>>>>>>>>>>
    >>>>>>>>>>>> --
    >>>>>>>>>>>> There are ways and there are ways,
    >>>>>>>>>>>>
    >>>>>>>>>>>> Geoffry Roberts
    >>>>>>>>>>
    >>>>>>>>>>
    >>>>>>>>>>
    >>>>>>>>>>
    >>>>>>>>>> --
    >>>>>>>>>> There are ways and there are ways,
    >>>>>>>>>>
    >>>>>>>>>> Geoffry Roberts
    >>>>>>>>
    >>>>>>>>
    >>>>>>
    
