Oy, that really did not come across well for me in email-form. Can you use paste.a.o or something?

+1 for moving "internal-only" iterators and IteratorUtils. Neither are things that we intend for users to need.

IMO, IteratorEnvironment was also kind of hokey/goofy (I never really used it either). I could go either way on it.

Keith Turner wrote:
I modified the apilyzer config in the 1.8 branch to include iterators.   It
spit out the following problems with types.

I think we could move IteratorUtil out of
org.apache.accumulo.core.iterators.  Also we could possibly move the
iterators in org.apache.accumulo.core.iterators.system somewhere else (or
declare it as an exception).  Could possibly deprecate IteratorEnvironment.
getConfig() and add a replacement method that returns a Map.

Problems :

   CONTEXT
TYPE
FIELD/METHOD                        NON-PUBLIC REFERENCE

   METHOD_RETURN
org.apache.accumulo.core.iterators.IteratorEnvironment
getConfig(...)
org.apache.accumulo.core.conf.AccumuloConfiguration
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
parseIterConf(...)
org.apache.accumulo.core.conf.AccumuloConfiguration
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.data.impl.KeyExtent
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.conf.AccumuloConfiguration
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.data.impl.KeyExtent
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.conf.AccumuloConfiguration
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.data.impl.KeyExtent
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.conf.AccumuloConfiguration
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.data.impl.KeyExtent
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.conf.AccumuloConfiguration
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.data.impl.KeyExtent
   METHOD_PARAM
org.apache.accumulo.core.iterators.IteratorUtil
loadIterators(...)
org.apache.accumulo.core.conf.AccumuloConfiguration
   CTOR_PARAM
org.apache.accumulo.core.iterators.system.MapFileIterator
(...)
org.apache.accumulo.core.conf.AccumuloConfiguration
   METHOD_RETURN
org.apache.accumulo.core.iterators.system.MapFileIterator
getSample(...)
org.apache.accumulo.core.file.FileSKVIterator
   METHOD_PARAM
org.apache.accumulo.core.iterators.system.MapFileIterator
getSample(...)
org.apache.accumulo.core.sample.impl.SamplerConfigurationImpl
   CTOR_PARAM
org.apache.accumulo.core.iterators.system.MultiIterator
(...)
org.apache.accumulo.core.data.impl.KeyExtent
   CTOR_PARAM
org.apache.accumulo.core.iterators.system.SampleIterator
(...)                               org.apache.accumulo.core.sample.Sampler
   CTOR_PARAM
org.apache.accumulo.core.iterators.system.SequenceFileIterator
(...)                               org.apache.hadoop.io.SequenceFile$Reader
   METHOD_RETURN
org.apache.accumulo.core.iterators.system.SequenceFileIterator
getSample(...)
org.apache.accumulo.core.file.FileSKVIterator
   METHOD_PARAM
org.apache.accumulo.core.iterators.system.SequenceFileIterator
getSample(...)
org.apache.accumulo.core.sample.impl.SamplerConfigurationImpl
   FIELD
org.apache.accumulo.core.iterators.system.VisibilityFilter
cache
org.apache.commons.collections.map.LRUMap
   FIELD
org.apache.accumulo.core.iterators.user.TransformingIterator
log                                 org.slf4j.Logger

Total : 22


On Mon, May 9, 2016 at 4:54 PM, Christopher<[email protected]>  wrote:

Hey Keith. Just wanted to ping you on this to see if you've had a chance to
look.

On Fri, Mar 25, 2016 at 12:17 PM Keith Turner<[email protected]>  wrote:

On Thu, Mar 24, 2016 at 8:27 PM, Christopher<[email protected]>
wrote:
It seems there's a general agreement that we treat it like public API
and
we should just call it public API.

I just don't know how we're going to actually say this is public API,
without addressing the issue of the other iterators in the same
package,
unless we're okay with going back to a more complicated API definition
which calls out specific classes instead of whole packages.

Keith, you spent the most time thinking about how to convey the public
API
in the README. What do you think about how to actually make this
happen?

I will look into it at the end of next week and try to come up with a
strategy for incorporating it into the public API.  Need to analyze what
types are used by the existing iterators.  In 1.7.0 I tried to make API
types only reference other API types.  Things that violated this type
constraint were deprecated and a replacement that met the type constraint
was introduced.  Maybe we can follow this path with iterators in 1.8.0,
not
sure.


Also, what would we want to say about the 1.6 and 1.7 branches API?
Maybe
if we can guarantee that there aren't any changes in those
classes/packages, we can just add them to the README as a new
declaration
of API (but not actually an addition in the semver sense), but if there
are
any changes, that's going to complicate things, because semver
guarantees
should mean no public API changes between bugfix releases. I guess I
should
actually check to see if there have been any changes...

On Thu, Mar 24, 2016 at 7:15 PM Keith Turner<[email protected]>  wrote:

On Thu, Mar 24, 2016 at 5:54 PM, Billie Rinaldi<
[email protected]>
wrote:

On Thu, Mar 24, 2016 at 1:15 PM, Christopher<[email protected]>
wrote:
We do have the opportunity to move to a new improved API, if
somebody
were
to put time into it. I guess that's true whether we put this in
the
public
API officially or not.

Agreed.  Even if we do create a new API, we can't change or drop
the
existing API without breaking a lot of people's code.  I feel like
SKVI
is
in a category of things that we treat as though they're in the
public
API,
so we might as well say it is.

+1 to that

Ensuring what exists is stable is a separate issue from creating a
new
iterator API.



I think maybe the hardest part is that we don't
really want to put just the interface in the API... but it exists
in
a
package with a bunch of other classes which probably shouldn't be
public
API. So, some thought needs to be put into *how* we're going to
do
it,
too.
On Thu, Mar 24, 2016 at 3:27 PM William Slacum<
[email protected]>
wrote:
It should be public API. It's one of the core reasons for
choosing
Accumulo
over a similar project like HBase or Cassandra. Allegedly, Jeff
"Mean
Gene"
Dean said we got the concept correct as well :)

Personally I hate the current API from a usability standpoint
(ie,
the
generic types are useless and already encoded in the name, it
needlessly
diverges from the standard java Iterator calling standards),
but
it's a
strong, identifying feature we have.

On Thu, Mar 24, 2016 at 2:50 PM, Christopher<
[email protected]>
wrote:
Accumulators,

What are the pros and cons that you can see for moving the
SortedKeyValueIterator into the public API?

Right now, I think there's still some need for improvement in
the
Iterator
API, and many of the iterators may not be stable enough to
really
recommend
people use without some serious caveats (because we may not
be
able
to
keep
their API stable very easily). So, there's a con.

In the pros side, iterators are a core feature of Accumulo,
and
nearly
all
of Accumulo's distributed processing capabilities are
dependent
upon
them.
It is reasonable to expect users to take advantage of them,
and
we've
at
least tried to be cautious about changing the iterators in
incompatible
ways, even if they aren't in the public API.

Recently, this came up when we stripped out all the
non-public
API
javadocs
from the website. (reported by Dan Blum on the user list on
March
4th:

http://mail-archives.apache.org/mod_mbox/accumulo-user/201603.mbox/%3C066a01d17658%24bc9dc1b0%2435d94510%24%40bbn.com%3E
)

What would it take for us to feel comfortable moving them to
the
public
API? Do we need a better interface first, or should we
isolate
the
other
iterators into another package (some of that has already been
done),
or
should we wait for a proper public API package (2.0?) to
provide
this
interface in?


Reply via email to