It isn't a big deal to increase the znode size, but it is bad practice. ZK isn't a file store; it is a coordination server. The size limit is there to keep large operations from slowing down other operations. If you aren't sharing your ZK cluster, or your neighbors don't have response time expectations, bumping the size is fine.
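If it helps, here is a rough sketch of a pre-flight check you could run before pushing a serialized model to ZooKeeper. It only uses the JDK, so it does not touch ZooKeeper or Mahout at all. The class name, the `fitsInZnode` helper, and the 1024-byte headroom are all made up for illustration; the only things taken from the docs you quoted are the `jute.maxbuffer` property name and its 0xfffff default.

```java
// Hypothetical helper, not part of ZooKeeper or Mahout.
class ZnodeSizeCheck {
    // Documented default for jute.maxbuffer: 0xfffff bytes, just under 1M.
    static final int DEFAULT_MAX_BUFFER = 0xfffff;

    // Limit configured for this JVM. The property has no "zookeeper." prefix,
    // so it is read as a plain system property with the default as fallback.
    static int maxBuffer() {
        return Integer.getInteger("jute.maxbuffer", DEFAULT_MAX_BUFFER);
    }

    // True if a payload of the given size, plus some headroom for the rest of
    // the request (path, ACLs, headers), stays within the limit.
    // The headroom value is an assumption, not a number from the ZooKeeper docs.
    static boolean fitsInZnode(long payloadBytes, int limit, int headroom) {
        return payloadBytes + headroom <= limit;
    }

    public static void main(String[] args) {
        long modelBytes = 1_800_000L; // roughly the 1.8m model from your mail
        int limit = maxBuffer();
        if (fitsInZnode(modelBytes, limit, 1024)) {
            System.out.println("Model fits under jute.maxbuffer=" + limit);
        } else {
            System.out.println("Model is too large for jute.maxbuffer=" + limit
                + "; shrink the model or raise the limit everywhere.");
        }
    }
}
```

If you do raise the limit, the admin docs you quoted say the property has to be set on all servers and clients, for example by passing something like `-Djute.maxbuffer=4194304` to each JVM (4M is just an illustrative value, not a recommendation).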
Keep in mind also that the ALR seems to lock down the learning rate too quickly for many problems. I haven't had time to investigate, but it is a good idea to treat ALR models with some caution. They shouldn't be crazy off-base, but they are likely to be less converged than is desirable.

On Tue, Aug 28, 2012 at 5:41 PM, Brandon Root <[email protected]> wrote:

> Hey! First off, Mahout is pretty much the bee's knees.
>
> Anyhoo, I'm deploying my mahout classifier using zookeeper, following the
> technique from Mahout in Action, but my models often exceed the 1M limit
> zookeeper wants you to stick to. I'm using the AdaptiveLogisticRegression
> algorithm, but I think I'm doing all the things I'm supposed to (only
> serializing the best model etc)
>
> Here is my code:
>
> ModelSerializer.writeBinary("/var/www/shared/model/products.model",
> learningAlgorithm.getBest().getPayload().getLearner().getModels().get(0));
>
> I feel like I'm missing something, most of my models are clocking in at
> something like 1.8m. The complete model is of course somewhere around 200m.
>
> Do most people boost the znode size? Am I simply being too ambitious with
> the number of features I'm using?
>
> ZooKeeper lists boosting znode size under "unsafe operations," but I don't
> know how big a deal this is.
>
> (From http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html)
>
> *(Java system property: jute.maxbuffer)
>
> This option can only be set as a Java system property. There is no
> zookeeper prefix on it. It specifies the maximum size of the data that can
> be stored in a znode. The default is 0xfffff, or just under 1M. If this
> option is changed, the system property must be set on all servers and
> clients otherwise problems will arise. This is really a sanity check.
> ZooKeeper is designed to store data on the order of kilobytes in size.*
>
> Any help would be much appreciated, thanks!
>
> Brandon Root
