I have made some updates to the blog based on feedback. It's attached here or you can see it at [1].
[1] https://blogs.apache.org/roller-ui/authoring/preview/accumulo/?previewEntry=the_accumulo_classloader

- Dave

-----Original Message-----
From: Bill Havanki [mailto:[email protected]]
Sent: Thursday, May 01, 2014 12:47 PM
To: Accumulo Dev List
Subject: Re: [VOTE] New Blog Entry

Overall the article is great! I have suggested edits, so I'd like to know where I can stick them (don't be rude now ;) ). We've used Review Board for doc feedback in the past ... that's an OK way. Dave, I can just email them to you to avoid spamming. Let me know.

On Wed, Apr 30, 2014 at 10:43 PM, Josh Elser <[email protected]> wrote:

> Ah ok. I was just looking through the link you provided and didn't
> notice an author at all.
>
> Just found it now in tiny letters at the bottom :)
>
> On 4/30/14, 10:18 PM, dlmarion wrote:
>
>> I believe that the author is shown. Well, at least the person who
>> posts it is shown. In this case it is one and the same.
>>
>> Sent via the Samsung GALAXY S®4, an AT&T 4G LTE smartphone
>>
>> -------- Original message --------
>> From: Josh Elser <[email protected]>
>> Date:04/30/2014 9:51 PM (GMT-05:00)
>> To: [email protected]
>> Subject: Re: [VOTE] New Blog Entry
>>
>> It would be nice to include yourself as the author of the post. That
>> would be nice to help users identify who created the content.
>>
>> On 4/30/14, 6:51 PM, [email protected] wrote:
>>
>>> I have created a new entry for the blog. The preview feature does
>>> not appear to be working at the moment. I will submit an INFRA issue for
>>> this.
>>> I have pasted the text below. For those that have a blog account,
>>> you should be able to see the blog at [1]. This blog entry is set to
>>> be published at 235959 3 May 2014 GMT pending no vetoes. This vote
>>> will remain open for 72 hours, until 2300 3 May 2014 GMT.
>>>
>>> [1] https://blogs.apache.org/roller-ui/authoring/preview/accumulo/?previewEntry=the_accumulo_classloader
>>>
>>> - Dave
>>> ------------------------------------------------------------
>>>
>>> Blog Title: The Accumulo Classloader
>>>
>>> Blog Text:
>>>
>>> First, some history
>>>
>>> The classloader in version 1.4 used a simple hierarchy of two
>>> classloaders that would load classes from locations specified by two
>>> properties. The locations specified by the "general.classpaths"
>>> property would be used to create a parent classloader and locations
>>> specified by the "general.dynamic.classpaths" property were used to
>>> create a child classloader. The child classloader would monitor the
>>> specified locations for changes and when a change occurred it would
>>> replace the child classloader with a new instance. Classes that
>>> referenced the orphaned child classloader would continue to work and
>>> the classloader would be garbage collected when no longer referenced.
>>>
>>> The only place where the dynamic classloader would come into play is
>>> for user iterators and their dependencies. The general advice for
>>> using this classloader would be to put the jars containing your
>>> iterators in the dynamic location. Everything else that does not
>>> change very often or would require a restart can be put into the
>>> non-dynamic location.
>>>
>>> There are a couple of things to note about the classloader in 1.4.
>>> First, if you modified the dynamic locations too often, you would
>>> run out of perm-gen space. This is likely due to unreferenced
>>> classes not being unloaded from the JVM. This is captured in
>>> ACCUMULO-599. Secondly, when you modified files in dynamic
>>> locations within the same cycle, it would on occasion miss the
>>> second change.
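
To make the parent/child arrangement described above concrete, here is a minimal Java sketch of a fixed parent classloader with a replaceable child. It is only an illustration under stated assumptions: the class name ReloadingLoaderSketch and the methods current() and onChange() are hypothetical and are not Accumulo's actual implementation, and the file monitoring that would trigger the swap is left out.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch of a two-level hierarchy; not Accumulo's real classes. */
public class ReloadingLoaderSketch {
    private final ClassLoader parent;       // built once from the non-dynamic locations
    private final URL[] dynamicUrls;        // locations that would be watched for changes
    private volatile URLClassLoader child;  // replaced whenever a change is detected

    public ReloadingLoaderSketch(URL[] staticUrls, URL[] dynamicUrls) {
        this.parent = new URLClassLoader(staticUrls, ReloadingLoaderSketch.class.getClassLoader());
        this.dynamicUrls = dynamicUrls.clone();
        this.child = new URLClassLoader(this.dynamicUrls, parent);
    }

    /** The loader new lookups should use. */
    public ClassLoader current() {
        return child;
    }

    /**
     * Would be called by a (not shown) monitor when a dynamic location changes.
     * The old child is simply dropped; classes already loaded from it keep
     * working, and it can be garbage collected once nothing references it.
     */
    public void onChange() {
        child = new URLClassLoader(dynamicUrls, parent);
    }

    public static void main(String[] args) {
        ReloadingLoaderSketch sketch = new ReloadingLoaderSketch(new URL[0], new URL[0]);
        ClassLoader before = sketch.current();
        sketch.onChange();
        ClassLoader after = sketch.current();
        System.out.println(before != after);                         // true: child was replaced
        System.out.println(before.getParent() == after.getParent()); // true: parent is shared
    }
}
```

The sketch swaps the child by plain reassignment, which mirrors the "replace with a new instance" behaviour in the text; how 1.4 actually detected changes is not covered here.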
>>>
>>> Out with the old, in with the new
>>>
>>> The Accumulo classloader was rewritten in version 1.5. It maintains
>>> the same dynamic capability and includes a couple of new features.
>>> The classloader uses Commons VFS so that it can load jars and
>>> classes from a variety of sources, including HDFS. Additionally, we
>>> introduced the notion of classloader contexts into Accumulo. This is
>>> not a new concept for anyone that has used an application server,
>>> but the implementation is a little different for Accumulo.
>>>
>>> The hierarchy set up by the new classloader uses the same property
>>> names as the old classloader. In the most basic configuration the
>>> locations specified by "general.classpaths" are used to create the
>>> root of the application classloader hierarchy. This classloader is a
>>> URLClassLoader and it does not support dynamic reloading. If you
>>> only specify this property, then you are loading all of your jars
>>> from the local file system and they will not be monitored for
>>> changes. We will call this top level application classloader the
>>> SYSTEM classloader. Next, a classloader is created that supports VFS
>>> sources and reloading. The parent of this classloader is the SYSTEM
>>> classloader and we will call this the VFS classloader. If the
>>> "general.vfs.classpaths" property is set, the VFS classloader will
>>> use this location. If the property is not set, it will use the value
>>> of "general.dynamic.classpaths" with a default value of
>>> $ACCUMULO_HOME/lib/ext to support backwards compatibility.
>>>
>>> Running Accumulo From HDFS
>>>
>>> If you have defined "general.vfs.classpaths" in your Accumulo
>>> configuration, then you can use the bootstrap_hdfs.sh script in the
>>> bin directory to seed HDFS with the Accumulo jars. A couple of jars
>>> will remain on the local file system for starting services.
>>> Now when you start up Accumulo, the master, gc, tracer, and all of the
>>> tablet servers will get their jars and classes from HDFS. The
>>> bootstrap_hdfs.sh script sets the replication on the directory, but
>>> you may want to set it higher after bootstrapping. An example configuration
>>> setting would be:
>>>
>>> <property>
>>>   <name>general.vfs.classpaths</name>
>>>   <value>hdfs://localhost:8020/accumulo/system-classpath</value>
>>>   <description>Configuration for a system level vfs classloader.
>>>   Accumulo jars can be configured here and loaded out of HDFS.</description>
>>> </property>
>>>
>>> About Contexts
>>>
>>> You can also define classloader contexts in your accumulo-site.xml file.
>>> A context is defined by a user-supplied name and it references
>>> locations like the other classloader properties. When a context is
>>> defined in the configuration, it can then be applied to one or more
>>> tables. When a context is applied to a table, then a classloader is
>>> created for that context. If multiple tables use the same context,
>>> then they share the context classloader. The context classloader is
>>> a child of the VFS classloader created above.
>>>
>>> The goal here is to enable multiple tenants to share the same
>>> Accumulo instance. For example, we may have a context called 'app1'
>>> which references the jars for application A. We may also have
>>> another context called 'app2' which references the jars for
>>> application B. By default the context classloader delegates to the
>>> parent classloader. This behavior may be overridden as seen in the app2
>>> example below.
>>>
>>> <property>
>>>   <name>general.vfs.context.classpath.app1</name>
>>>   <value>hdfs://localhost:8020/applicationA/classpath/.*.jar,
>>>   file:///opt/applicationA/lib/.*.jar</value>
>>>   <description>Application A classpath, loads jars from HDFS
>>>   and local file system</description>
>>> </property>
>>>
>>> <property>
>>>   <name>general.vfs.context.classpath.app2.delegation=post</name>
>>>   <value>hdfs://localhost:8020/applicationB/classpath/.*.jar,
>>>   http://my-webserver/applicationB/.*.jar</value>
>>>   <description>Application B classpath, loads jars from HDFS
>>>   and HTTP, does not delegate to parent first</description>
>>> </property>
>>>
>>> Context classloaders do not have to be defined in the
>>> accumulo-site.xml file. The
>>> "general.vfs.context.classpath.{context}" property can be defined on
>>> the table either programmatically or manually in the shell. Then set
>>> the "table.classpath.context" property on your table.
>>>
>>> Known Issues
>>>
>>> Remember the two issues I mentioned above? Well, they are still a
>>> problem.
>>>
>>> * ACCUMULO-1507 is tracking VFS-487 for frequent
>>> modifications to files.
>>> * If you start running out of perm-gen space, take a look at
>>> ACCUMULO-599 and try applying the JVM settings for class unloading.
>>> * Additionally, there is an issue with the bootstrap_hdfs.sh
>>> script detailed in ACCUMULO-2761. There is a workaround listed in
>>> the issue.
>>>
>>> I have disabled comments as I see they are being abused in other blogs.
>>> Please email the dev list for comments and questions.
>>>

--
// Bill Havanki
// Solutions Architect, Cloudera Govt Solutions
// 443.686.9283
