[jira] [Commented] (ACCUMULO-1972) Range constructors call overridable method
[ https://issues.apache.org/jira/browse/ACCUMULO-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16244997#comment-16244997 ]

Christopher Tubbs commented on ACCUMULO-1972:
---------------------------------------------

Thanks for the patch, [~coffeethulhu]. I have some concerns about it.

Doing nothing means it is still possible for careful developers to override the method. Applying this fix prevents users from doing the very thing that would have caused the bug in the first place (that is, overriding the method). I don't think this is the best fix we can do, as it breaks the API and prevents the only use case that would have been affected by the bug.

It seems to me that a better fix might be to move the implementation to a private method, which can be safely called in the constructor and also be called in the public method. That way, if the user overrides the public method, the behavior of the constructor will be unaffected. This fix would also allow us to avoid an API breakage, which means we can fix it in 1.7.x and 1.8.x instead of waiting for 2.x.

> Range constructors call overridable method
> ------------------------------------------
>
> Key: ACCUMULO-1972
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1972
> Project: Accumulo
> Issue Type: Bug
> Affects Versions: 1.4.4, 1.5.0
> Reporter: Bill Havanki
> Priority: Minor
> Labels: newbie
> Attachments: accumulo-1972.patch
>
>
> Several {{Range}} constructors call {{Range.beforeStartKey()}}, which is not
> final. This is dangerous:
> bq. The superclass constructor runs before the subclass constructor, so the
> overriding method in the subclass will get invoked before the subclass
> constructor has run. If the overriding method depends on any initialization
> performed by the subclass constructor, the method will not behave as
> expected. ??Item 17, Effective Java Vol. 2, Bloch??
> If {{beforeStartKey()}} cannot be made final, the code should be refactored
> to make the constructors safe.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
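The refactoring proposed in the comment above (constructor calls a private method, public method delegates to it) could look roughly like the sketch below. It is an illustration only, not the actual `Range` code: `BaseRange`, `TracingRange`, `computeBeforeStartKey()`, and the `String` keys are hypothetical stand-ins; only the method name `beforeStartKey()` comes from the issue.

```java
// Hypothetical stand-in for Range; only the name beforeStartKey() comes from the issue.
class BaseRange {
  private final String startKey;
  private final boolean startsBeforeKey;

  BaseRange(String startKey) {
    this.startKey = startKey;
    // Safe: a private method cannot be overridden, so no subclass code
    // runs before the subclass has finished initializing.
    this.startsBeforeKey = computeBeforeStartKey("");
  }

  // Public, overridable entry point that delegates to the private implementation.
  public boolean beforeStartKey(String key) {
    return computeBeforeStartKey(key);
  }

  private boolean computeBeforeStartKey(String key) {
    return startKey != null && key.compareTo(startKey) < 0;
  }

  boolean startsBeforeKey() {
    return startsBeforeKey;
  }
}

// A subclass whose override depends on its own state.
class TracingRange extends BaseRange {
  // Field initializers run only after the superclass constructor returns.
  private final StringBuilder log = new StringBuilder();

  TracingRange(String startKey) {
    super(startKey);
  }

  @Override
  public boolean beforeStartKey(String key) {
    // Would throw a NullPointerException if the superclass constructor called this method.
    log.append(key).append(';');
    return super.beforeStartKey(key);
  }

  String log() {
    return log.toString();
  }
}
```

With this shape, the subclass can still override the public method, and the constructor never observes a partially initialized subclass.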
[GitHub] jmark99 commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers
jmark99 commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#discussion_r149807799 ## File path: _includes/nav.html ## @@ -24,6 +24,8 @@ Documentation + Quickstart (1.x) + Quickstart (2.x) Review comment: I'll remove the link, as the info already exists in the quick-install file. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] jmark99 commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers
jmark99 commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#discussion_r149807353 ## File path: pages/INSTALL.md ## @@ -0,0 +1,1182 @@ + + + + + + + + Review comment: Not sure. I do not recall creating it. I'll delete it.
[GitHub] keith-turner commented on a change in pull request #320: ACCUMULO-4730 Created EntryLengthSummarizer
keith-turner commented on a change in pull request #320: ACCUMULO-4730 Created EntryLengthSummarizer URL: https://github.com/apache/accumulo/pull/320#discussion_r149801350 ## File path: core/src/main/java/org/apache/accumulo/core/client/summary/summarizers/EntryLengthSummarizer.java ## @@ -0,0 +1,329 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License");you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.accumulo.core.client.summary.summarizers; + +import java.math.RoundingMode; + +import org.apache.accumulo.core.client.summary.Summarizer; +import org.apache.accumulo.core.client.summary.SummarizerConfiguration; +import org.apache.accumulo.core.data.Key; +import org.apache.accumulo.core.data.Value; + +import com.google.common.math.IntMath; + +/** + * Summarizer that computes summary information about field lengths. + * Specifically key length, row length, family length, qualifier length, visibility length, and value length. + * Incrementally computes minimum, maximum, count, sum, and log2 histogram of the lengths. 
+ */ +public class EntryLengthSummarizer implements Summarizer { + + public static final String MIN_KEY_STAT = "minKey"; + public static final String MAX_KEY_STAT = "maxKey"; + public static final String SUM_KEYS_STAT = "sumKeys"; + + public static final String MIN_ROW_STAT = "minRow"; + public static final String MAX_ROW_STAT = "maxRow"; + public static final String SUM_ROWS_STAT = "sumRows"; + + public static final String MIN_FAMILY_STAT = "minFamily"; + public static final String MAX_FAMILY_STAT = "maxFamily"; + public static final String SUM_FAMILIES_STAT = "sumFamilies"; + + public static final String MIN_QUALIFIER_STAT = "minQualifier"; + public static final String MAX_QUALIFIER_STAT = "maxQualifier"; + public static final String SUM_QUALIFIERS_STAT = "sumQualifiers"; + + public static final String MIN_VISIBILITY_STAT = "minVisibility"; + public static final String MAX_VISIBILITY_STAT = "maxVisibility"; + public static final String SUM_VISIBILITIES_STAT = "sumVisibilities"; + + public static final String MIN_VALUE_STAT = "minValue"; + public static final String MAX_VALUE_STAT = "maxValue"; + public static final String SUM_VALUES_STAT = "sumValues"; + + public static final String TOTAL_STAT = "total";// Total number of Keys + + @Override + public Collector collector(SummarizerConfiguration sc) { +return new Collector() { + + private long minKey = Long.MAX_VALUE; + private long maxKey = Long.MIN_VALUE; + private long sumKeys = 0; + private long[] keyCounts = new long[32]; + + private long minRow = Long.MAX_VALUE; + private long maxRow = Long.MIN_VALUE; + private long sumRows = 0; + private long[] rowCounts = new long[32]; + + private long minFamily = Long.MAX_VALUE; + private long maxFamily = Long.MIN_VALUE; + private long sumFamilies = 0; + private long[] familyCounts = new long[32]; + + private long minQualifier = Long.MAX_VALUE; + private long maxQualifier = Long.MIN_VALUE; + private long sumQualifiers = 0; + private long[] qualifierCounts = new long[32]; + 
+ private long minVisibility = Long.MAX_VALUE; + private long maxVisibility = Long.MIN_VALUE; + private long sumVisibilities = 0; + private long[] visibilityCounts = new long[32]; + + private long minValue = Long.MAX_VALUE; + private long maxValue = Long.MIN_VALUE; + private long sumValues = 0; + private long[] valueCounts = new long[32]; + + private long total = 0; + + @Override + public void accept(Key k, Value v) { +int idx; + +// KEYS +if (k.getLength() < minKey) { + minKey = k.getLength(); +} + +if (k.getLength() > maxKey) { + maxKey = k.getLength(); +} + +sumKeys += k.getLength(); + +if (k.getLength() == 0) { + idx = 0; +} else { + idx = IntMath.log2(k.getLength(), RoundingMode.HALF_UP); +} + +keyCounts[idx]++; + +// ROWS +if (k.getRowData().length() < minRow) { + minRow = k.getRowData().length(); +} + +if (k.getRowData().length() > maxRow) { + maxRow =
[GitHub] keith-turner commented on a change in pull request #320: ACCUMULO-4730 Created EntryLengthSummarizer
keith-turner commented on a change in pull request #320: ACCUMULO-4730 Created EntryLengthSummarizer URL: https://github.com/apache/accumulo/pull/320#discussion_r149797719 ## File path: core/src/main/java/org/apache/accumulo/core/client/summary/summarizers/EntryLengthSummarizer.java ## @@ -0,0 +1,329 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License");you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.accumulo.core.client.summary.summarizers; + +import java.math.RoundingMode; + +import org.apache.accumulo.core.client.summary.Summarizer; +import org.apache.accumulo.core.client.summary.SummarizerConfiguration; +import org.apache.accumulo.core.data.Key; +import org.apache.accumulo.core.data.Value; + +import com.google.common.math.IntMath; + +/** + * Summarizer that computes summary information about field lengths. + * Specifically key length, row length, family length, qualifier length, visibility length, and value length. + * Incrementally computes minimum, maximum, count, sum, and log2 histogram of the lengths. 
+ */ +public class EntryLengthSummarizer implements Summarizer { + + public static final String MIN_KEY_STAT = "minKey"; + public static final String MAX_KEY_STAT = "maxKey"; + public static final String SUM_KEYS_STAT = "sumKeys"; + + public static final String MIN_ROW_STAT = "minRow"; + public static final String MAX_ROW_STAT = "maxRow"; + public static final String SUM_ROWS_STAT = "sumRows"; + + public static final String MIN_FAMILY_STAT = "minFamily"; + public static final String MAX_FAMILY_STAT = "maxFamily"; + public static final String SUM_FAMILIES_STAT = "sumFamilies"; + + public static final String MIN_QUALIFIER_STAT = "minQualifier"; + public static final String MAX_QUALIFIER_STAT = "maxQualifier"; + public static final String SUM_QUALIFIERS_STAT = "sumQualifiers"; + + public static final String MIN_VISIBILITY_STAT = "minVisibility"; + public static final String MAX_VISIBILITY_STAT = "maxVisibility"; + public static final String SUM_VISIBILITIES_STAT = "sumVisibilities"; + + public static final String MIN_VALUE_STAT = "minValue"; + public static final String MAX_VALUE_STAT = "maxValue"; + public static final String SUM_VALUES_STAT = "sumValues"; + + public static final String TOTAL_STAT = "total";// Total number of Keys +

Review comment: It would be nice to reduce the redundant code. One possible way to do this is to create an internal class like the following.

```java
private static class LengthStats {
  private long min = Long.MAX_VALUE;
  private long max = Long.MIN_VALUE;
  private long sum = 0;
  private long[] counts = new long[32];

  // Update min, max, sum, and the log2 histogram with one observed length.
  private void accept(int length) {
    int idx;

    if (length < min) {
      min = length;
    }

    if (length > max) {
      max = length;
    }

    sum += length;

    if (length == 0) {
      idx = 0;
    } else {
      idx = IntMath.log2(length, RoundingMode.HALF_UP);
    }

    counts[idx]++;
  }

  // Emit the statistics under the given field prefix, e.g. "key.min".
  void summarize(String prefix, SummaryConsumer sc) {
    sc.accept(prefix + ".min", (min != Long.MAX_VALUE ? min : 0));
    sc.accept(prefix + ".max", (max != Long.MIN_VALUE ? max : 0));
    sc.accept(prefix + ".sum", sum);

    // Only output non-zero histogram buckets.
    for (int i = 0; i < counts.length; i++) {
      if (counts[i] > 0) {
        sc.accept(prefix + ".logHist." + i, counts[i]);
      }
    }
  }
}
```

Then create an instance of this class for each field and use it in the Collector's `accept` and `summarize` methods.
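The last step of the review comment, one stats object per field inside the collector, might be wired up along these lines. This is a self-contained sketch under stated assumptions and does not use the Accumulo `Summarizer` API: `EntryLengthCollector` is a hypothetical name, `ObjLongConsumer<String>` stands in for the consumer type, raw `byte[]` arguments stand in for `Key` and `Value`, and the histogram bucket uses a floor log2 from the JDK instead of Guava's `IntMath.log2(length, RoundingMode.HALF_UP)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ObjLongConsumer;

// Per-field statistics, modeled on the class suggested in the review comment.
class LengthStats {
  private long min = Long.MAX_VALUE;
  private long max = Long.MIN_VALUE;
  private long sum = 0;
  private final long[] counts = new long[32];

  void accept(int length) {
    min = Math.min(min, length);
    max = Math.max(max, length);
    sum += length;
    // Assumption: floor(log2(length)) replaces Guava's HALF_UP rounding here.
    int idx = length == 0 ? 0 : 31 - Integer.numberOfLeadingZeros(length);
    counts[idx]++;
  }

  void summarize(String prefix, ObjLongConsumer<String> sc) {
    sc.accept(prefix + ".min", min != Long.MAX_VALUE ? min : 0);
    sc.accept(prefix + ".max", max != Long.MIN_VALUE ? max : 0);
    sc.accept(prefix + ".sum", sum);
    for (int i = 0; i < counts.length; i++) {
      if (counts[i] > 0) {
        sc.accept(prefix + ".logHist." + i, counts[i]);
      }
    }
  }
}

// Hypothetical collector: one LengthStats per field instead of four variables per field.
class EntryLengthCollector {
  private final LengthStats rowStats = new LengthStats();
  private final LengthStats valueStats = new LengthStats();
  private long total = 0;

  void accept(byte[] row, byte[] value) {
    rowStats.accept(row.length);
    valueStats.accept(value.length);
    total++;
  }

  Map<String, Long> summarize() {
    Map<String, Long> out = new LinkedHashMap<>();
    rowStats.summarize("row", out::put);
    valueStats.summarize("value", out::put);
    out.put("total", total);
    return out;
  }
}
```

Adding another field then costs one extra `LengthStats` instance and two extra lines, rather than a new block of min/max/sum/histogram code.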
[jira] [Updated] (ACCUMULO-4730) Create an Entry length summarizer
[ https://issues.apache.org/jira/browse/ACCUMULO-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4730:
------------------------------------
Labels: newbie pull-request-available (was: newbie)

> Create an Entry length summarizer
> ---------------------------------
>
> Key: ACCUMULO-4730
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4730
> Project: Accumulo
> Issue Type: Improvement
> Reporter: Keith Turner
> Assignee: Jared R
> Labels: newbie, pull-request-available
> Fix For: 2.0.0
>
>
> It would be very useful to have a built in
> [Summarizer|https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/summary/Summarizer.java]
> that computes summary information about field lengths. Specifically key
> length, row length, family length, qualifier length, visibility length, and
> value length. Whatever stats are computed must be able to be computed
> incrementally. For example, it can incrementally compute min, max, count, sum,
> and log2 histogram. I think these would be good stats to start with. Count
> and sum can be used to compute the average. There is an example of computing
> a log2 histogram in the Summarizer javadoc.
> The Summarizer could be named EntryLengthSummarizer and possibly produce
> summaries like the following.
> {noformat}
> count=XXX //do not need to track this per field, it's the same for all
> key.min=XXX
> key.max=XXX
> key.sum=XXX
> key.logHist.8=XXX //only output non zero exponents
> key.logHist.9=XXX
> row.min=XXX
> row.max=XXX
> row.sum=XXX
> row.logHist.7=XXX
> row.logHist.8=XXX
> row.logHist.10=XXX
> family.min=XXX
> family.max=XXX
> family.sum=XXX
> family.logHist.6=XXX
> family.logHist.7=XXX
> etc...
> {noformat}
> This new summarizer would be placed in the
> [summarizers|https://github.com/apache/accumulo/tree/master/core/src/main/java/org/apache/accumulo/core/client/summary/summarizers]
> package.
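The issue notes that count and sum are enough to derive the average. A minimal sketch of that derivation follows, assuming the summary arrives as a `Map<String, Long>` keyed as in the example output above (`count`, `key.sum`, and so on); `SummaryAverages` and `average()` are hypothetical names, not part of Accumulo.

```java
import java.util.Map;

class SummaryAverages {
  // Average length of one field, e.g. field "key" reads "key.sum" and the shared "count".
  static double average(Map<String, Long> summary, String field) {
    long count = summary.getOrDefault("count", 0L);
    if (count == 0) {
      return 0.0; // no entries were summarized
    }
    long sum = summary.getOrDefault(field + ".sum", 0L);
    return (double) sum / count;
  }
}
```

Because count and sum are both additive, summaries from several files can be merged by adding the values, and the average can be recomputed from the merged totals.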
[GitHub] ctubbsii commented on issue #320: ACCUMULO-4730 Created EntryLengthSummarizer
ctubbsii commented on issue #320: ACCUMULO-4730 Created EntryLengthSummarizer URL: https://github.com/apache/accumulo/pull/320#issuecomment-342963831 Test comment.
[GitHub] mikewalch commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers
mikewalch commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#discussion_r149796784 ## File path: pages/INSTALL.md ## @@ -0,0 +1,1182 @@ + + + + + + + + Review comment: What is the purpose of this file?
[GitHub] mikewalch commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers
mikewalch commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#discussion_r149792570 ## File path: contributor/build.md ## @@ -0,0 +1,82 @@ +--- +title: Building Accumulo Review comment: I put the documentation from the old `source.md` into this doc. You can move this back into `contributors-guide.md` if you prefer.
[GitHub] mikewalch commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers
mikewalch commented on a change in pull request #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#discussion_r149792297 ## File path: _includes/nav.html ## @@ -24,6 +24,8 @@ Documentation + Quickstart (1.x) + Quickstart (2.x) Review comment: The 2.x quickstart is on the website in the 2.0 docs at https://accumulo.apache.org/docs/2.0/getting-started/quick-install It would be better to link to the URL above. However, I am not sure about linking to unreleased docs yet. It helps contributors but can confuse users.
[GitHub] mikewalch commented on issue #37: Accumulo 4714 Create landing page for new developers
mikewalch commented on issue #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#issuecomment-342901759 @jmark99, Ok great. Feel free to copy/commit my contactus.md & how-to-contribute.md pages into your PR.
[GitHub] jmark99 commented on issue #37: Accumulo 4714 Create landing page for new developers
jmark99 commented on issue #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#issuecomment-342895558 @mikewalch, sounds good. I'll start updating the current pull request to go in this direction.
[GitHub] mikewalch commented on issue #37: Accumulo 4714 Create landing page for new developers
mikewalch commented on issue #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#issuecomment-342895102 @jmark99, I was going to merge `get_involved` and `mailing_list` into a [contactus](https://github.com/mikewalch/accumulo-website/blob/640b260ae411ef86b136864e3a8284d4bcce0fcd/pages/contact-us.md) page. Btw, I am just copying what is being done on the [Fluo website](https://fluo.apache.org), which seems to be working, as Fluo is getting more first-time contributors than Accumulo. I think this is due to simple instructions. We could limit the pages to `contactus`, `how-to-contribute` and `contributor-guide`. Things that aren't necessary for the minimal `contactus` & `how-to-contribute` pages, like lazy consensus, voting, etc., could go into the `contributor-guide` page.
[GitHub] jmark99 commented on issue #37: Accumulo 4714 Create landing page for new developers
jmark99 commented on issue #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#issuecomment-342879256 @mikewalch, do you envision us keeping the 'get_involved' page in the drop-down? Since you address some ideas for getting involved, I'm not sure it needs to be a separate drop-down selection. It contains some information that probably isn't needed for someone just looking to get involved, i.e., lazy consensus info, voting, etc. Any information on that page that is not already in the contributor guide could be added there, and the get_involved page could be replaced with your "how to contribute" page.
[GitHub] milleruntime commented on issue #37: Accumulo 4714 Create landing page for new developers
milleruntime commented on issue #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#issuecomment-342873531 FYI, I'd like to review the finished product of whatever you guys come up with.
[GitHub] mikewalch commented on issue #37: Accumulo 4714 Create landing page for new developers
mikewalch commented on issue #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#issuecomment-342873536 @jmark99, I like your idea. We could add the following text to the top and bottom of my page and link to your page.

```
This page provides basic instructions for contributing to Accumulo. If you are looking for more information, check out the more comprehensive [contributor guide](/contributor-guide/).
```

Contributors could start with the `how-to-contribute.md` page that I created (which could probably even be shortened some more) and then reference your guide if they need more information.
[GitHub] jkrdev opened a new pull request #320: Accumulo-4730 Created EntryLengthSummarizer
jkrdev opened a new pull request #320: Accumulo-4730 Created EntryLengthSummarizer URL: https://github.com/apache/accumulo/pull/320 Built in Summarizer that computes summary information about field lengths. Specifically key length, row length, family length, qualifier length, visibility length, and value length. Incrementally computes min, max, count, sum, and log2 histogram.
[GitHub] PircDef commented on a change in pull request #319: ACCUMULO-4737 Clean up cipher algorithm configuration
PircDef commented on a change in pull request #319: ACCUMULO-4737 Clean up cipher algorithm configuration URL: https://github.com/apache/accumulo/pull/319#discussion_r149690043 ## File path: core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoModuleFactory.java ## @@ -239,14 +239,6 @@ public CryptoModuleParameters initializeCipher(CryptoModuleParameters params) { } - public static String[] parseCipherTransform(String cipherTransform) { -if (cipherTransform == null) { - return new String[3]; -} - -return cipherTransform.split("/"); Review comment: I don't see any more splits in the crypto code
[GitHub] jmark99 commented on issue #37: Accumulo 4714 Create landing page for new developers
jmark99 commented on issue #37: Accumulo 4714 Create landing page for new developers URL: https://github.com/apache/accumulo-website/pull/37#issuecomment-342826688 @mikewalch, That was a concern as I was working on it. It slowly kept growing as the various bits of information were being collected. I actually like the work you have done quite a lot. As a new contributor I think one is most interested in getting the basic info needed to get them started. Your page does a great job of that. The hope with the submitted layout was that a user could click on the topic that they were interested in and not bother with the additional information but in the end the page did appear to be a little overwhelming. (Information on git workflow, etc, would not be that relevant to a new user initially). I would be good with using your page as the basis for what a new contributor would need to do to get up and running and then find the appropriate way to link to the additional information. Perhaps something like having the site point to your page as the one new developers would reference for initial contribution guidance and then use the page I created to serve as an all-inclusive contributor manual or something like that (renaming it from index.md of course). That way we could add details as needed to the manual since the role of the page would be to serve as a comprehensive repository of contributor-related information. Thoughts?