Re: which classes/methods are considered as private in Spark?
Before each release, during the RC phase, I used to go through every single doc page to make sure we hadn't unintentionally left things public. Unfortunately I no longer have time to do that. I found it very useful, because I always caught some mistakes that had slipped in through organic development.
Re: which classes/methods are considered as private in Spark?
> Could you clarify what you mean here? Mima has some known limitations such
> as not handling "private[blah]" very well

Yes, that's what I mean.

What I want to know is which classes/methods we expect to be private. I think things marked as "private[blabla]" are certainly expected to be private; it's just that MiMa and the doc generator can't handle them well. We can fix that later, probably by using the @Private annotation.

> seems like it's tracked by a bunch of exclusions in the Unidoc object

That's good. At least we have a clear definition of which packages are meant to be private. We should make it consistent between MiMa and the doc generator, though.
Re: which classes/methods are considered as private in Spark?
On Tue, Nov 13, 2018 at 6:26 PM Wenchen Fan wrote:
> Recently I updated the MiMa exclusion rules, and found MiMa tracks some
> private classes/methods unexpectedly.

Could you clarify what you mean here? Mima has some known limitations such as not handling "private[blah]" very well (because that means public in Java). Spark has (had?) this tool to generate an exclusions file for Mima, but not sure how up-to-date it is.

> AFAIK, we have several rules:
> 1. everything which is really private that end users can't access, e.g.
> package private classes, private methods, etc.
> 2. classes under certain packages. I don't know if we have a list, the
> catalyst package is considered as a private package.
> 3. everything which has a @Private annotation.

That's my understanding of the scope of the rules.

(2) to me means "things that show up in the public API docs". That's, AFAIK, tracked in SparkBuild.scala; seems like it's tracked by a bunch of exclusions in the Unidoc object (I remember that being different in the past).

(3) might be a limitation of the doc generation tool? Not sure if it's easy to say "do not document classes that have @Private". At the very least, that annotation seems to be missing the "@Documented" annotation, which would make that info present in the javadoc. I do not know if the scala doc tool handles that.

--
Marcelo

- To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
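To make the "@Documented" point concrete, here is a minimal Java sketch. It is only an illustration: `InternalApi` is a hypothetical marker annotation invented for this example, not Spark's actual @Private definition, and the helper names are assumptions as well. It shows that an annotation type meta-annotated with `java.lang.annotation.Documented` is one that javadoc will display on annotated elements, and that the marker itself can be detected via reflection when it has runtime retention.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class DocumentedAnnotationSketch {

    // Hypothetical marker, NOT Spark's real @Private annotation.
    // @Documented asks javadoc to show the marker on annotated elements.
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD})
    @interface InternalApi {}

    // A class that is public for technical reasons but meant to be internal.
    @InternalApi
    public static class DummyInternal {}

    // A class with no marker at all.
    public static class PlainClass {}

    // True if the class carries the hypothetical marker (needs RUNTIME retention).
    static boolean isMarkedInternal(Class<?> cls) {
        return cls.isAnnotationPresent(InternalApi.class);
    }

    // True if the annotation type itself is meta-annotated with @Documented.
    static boolean isDocumentedAnnotation(Class<? extends Annotation> type) {
        return type.isAnnotationPresent(Documented.class);
    }

    public static void main(String[] args) {
        System.out.println("DummyInternal marked: " + isMarkedInternal(DummyInternal.class));
        System.out.println("PlainClass marked: " + isMarkedInternal(PlainClass.class));
        System.out.println("InternalApi documented: " + isDocumentedAnnotation(InternalApi.class));
    }
}
```

Whether the Scala doc tool honours the same meta-annotation is left open here, as it is in the message above.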
Re: which classes/methods are considered as private in Spark?
You should find that 'surprisingly public' classes are there because of language technicalities. For example, DummySerializerInstance is public because it's a Java class, and can't be used outside its package otherwise. Likewise I think MiMa just looks at bytecode, and private[spark] classes are public in the bytecode for similar reasons (although Scala enforces the access within Scala as expected). Hence it will flag changes to "nonpublic" private[spark] classes.

I think things that are meant to be marked private are, well, marked private, or else as private as possible and flagged with annotations like @Private. (It does sound like DummySerializerInstance should be so annotated?) Yes, the catalyst package in its entirety is one big exception - private by fiat, not by painstaking flagging of every class.

The issue to me is really docs. If we have java/scaladoc of private classes, and there's a way to avoid that like with annotations, that should be fixed.

On Tue, Nov 13, 2018 at 6:26 PM Wenchen Fan wrote:
>
> Hi all,
>
> Recently I updated the MiMa exclusion rules, and found MiMa tracks some
> private classes/methods unexpectedly.
>
> Note that, "private" here means that, we have no guarantee about
> compatibility. We don't provide documents and users need to take the risk
> when using them.
>
> In the API document, it has some obvious private classes, e.g.
> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.serializer.DummySerializerInstance
> , which is not expected either.
>
> I looked around and can't find a clear definition of "private" in Spark.
>
> AFAIK, we have several rules:
> 1. everything which is really private that end users can't access, e.g.
> package private classes, private methods, etc.
> 2. classes under certain packages. I don't know if we have a list, the
> catalyst package is considered as a private package.
> 3. everything which has a @Private annotation.
>
> I'm sending this email to collect more feedback, and hope we can come up with
> a clear definition about what is "private".
>
> Thanks,
> Wenchen
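The three rules discussed in this thread can be sketched as a single predicate. This is a rough illustration under stated assumptions, not how MiMa or Spark's build actually decides anything: `InternalApi` is a hypothetical stand-in for @Private, the prefix list contains only the catalyst package because it is the only one named in the thread, and all class and method names are invented for the example. Note that a check on bytecode-level modifiers (rule 1) cannot see Scala's private[spark], which is compiled as public, so such classes would have to be caught by rule 2 or rule 3.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.List;

public class PrivateApiRulesSketch {

    // Hypothetical marker standing in for an annotation like @Private.
    @Retention(RetentionPolicy.RUNTIME)
    @interface InternalApi {}

    // Assumed list of "private by fiat" packages; only catalyst is named in the thread.
    static final List<String> PRIVATE_PACKAGE_PREFIXES =
        Arrays.asList("org.apache.spark.sql.catalyst");

    // Rule 2: the fully qualified class name lives under a private package.
    static boolean inPrivatePackage(String className) {
        for (String prefix : PRIVATE_PACKAGE_PREFIXES) {
            // Append "." so that "catalystx" does not match "catalyst".
            if (className.startsWith(prefix + ".")) {
                return true;
            }
        }
        return false;
    }

    // Combines the three rules for a loaded class.
    static boolean isPrivateApi(Class<?> cls) {
        // Rule 1: not public at the bytecode level (package-private, private, ...).
        if (!Modifier.isPublic(cls.getModifiers())) {
            return true;
        }
        // Rule 2: member of a package that is private as a whole.
        if (inPrivatePackage(cls.getName())) {
            return true;
        }
        // Rule 3: explicitly flagged with the marker annotation.
        return cls.isAnnotationPresent(InternalApi.class);
    }

    // Sample classes used to exercise the rules.
    static class PackagePrivateHelper {}

    @InternalApi
    public static class PublicButFlagged {}

    public static class PublicApi {}

    public static void main(String[] args) {
        System.out.println("PackagePrivateHelper: " + isPrivateApi(PackagePrivateHelper.class));
        System.out.println("PublicButFlagged: " + isPrivateApi(PublicButFlagged.class));
        System.out.println("PublicApi: " + isPrivateApi(PublicApi.class));
    }
}
```

Keeping the package list in one place is one way to get the consistency between MiMa and the doc generator that was asked for earlier in the thread, since both could be driven from the same definition.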