stevedlawrence commented on code in PR #1158:
URL: https://github.com/apache/daffodil/pull/1158#discussion_r1488005307
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/api/JPrimType.java:
##########
@@ -14,16 +14,29 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+package org.apache.daffodil.runtime1.api;
-package org.apache.daffodil.runtime1.dpath
+public enum JPrimType {
-import org.junit.Assert.assertTrue
-import org.junit.Test
-
-class TestNodeInfo {
-
- @Test def test_nodeInfoCanBeConstructed(): Unit = {
- assertTrue(NodeInfo.isXDerivedFromY("Byte", "Long"))
- }
+ String,
+ Int,
+ Byte,
+ Short,
+ Long,
+ Integer,
+ Decimal,
+ UnsignedInt,
+ UnsignedByte,
+ UnsignedShort,
+ UnsignedLong,
+ NonNegativeInteger,
Review Comment:
The java world doesn't have concepts of these unsigned types. Does it make
sense for the public API to hide these, so that API users don't need to be
aware of things like if the JPrimType is `UnsignedShort`, then the actual type
is `Int`? Or do we just have documntation that says if JPrimType is X, then the
actual type that we return is Y, which may or may not be the same as X.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/infoset/JDOMInfosetOutputter.scala:
##########
@@ -56,7 +56,7 @@ class JDOMInfosetOutputter extends InfosetOutputter {
if (!simple.isNilled) {
val text =
- if (simple.metadata.primitiveType == PrimitiveType.String) {
+ if (simple.metadata.jPrimType == JPrimType.String) {
Review Comment:
Personally, I think `primitiveType` and `PrimitiveType` make for a better
API name/type than `jPrimType`/`JPrimType`.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/dpath/NodeInfo.scala:
##########
@@ -298,36 +279,36 @@ object NodeInfo extends Enum {
def fromObject(a: Any) = {
a match {
- case x: String => NodeInfo.String
- case x: Int => NodeInfo.Int
- case x: Byte => NodeInfo.Byte
- case x: Short => NodeInfo.Short
- case x: Long => NodeInfo.Long
- case x: JBigInt => NodeInfo.Integer
- case x: JBigDecimal => NodeInfo.Decimal
- case x: Double => NodeInfo.Double
- case x: Float => NodeInfo.Float
- case x: Array[Byte] => NodeInfo.HexBinary
- case x: URI => NodeInfo.AnyURI
- case x: Boolean => NodeInfo.Boolean
- case x: DFDLCalendar => NodeInfo.DateTime
+ case x: String => PrimType.String
+ case x: Int => PrimType.Int
+ case x: Byte => PrimType.Byte
+ case x: Short => PrimType.Short
+ case x: Long => PrimType.Long
+ case x: JBigInt => PrimType.Integer
+ case x: JBigDecimal => PrimType.Decimal
+ case x: Double => PrimType.Double
+ case x: Float => PrimType.Float
+ case x: Array[Byte] => PrimType.HexBinary
+ case x: URI => PrimType.AnyURI
+ case x: Boolean => PrimType.Boolean
+ case x: DFDLCalendar => PrimType.DateTime
case _ => Assert.usageError("Unsupported object representation type:
%s".format(a))
}
}
def fromClass(jc: Class[_]) = {
val ni = jc match {
- case ClassIntBoxed | ClassIntPrim => Some(NodeInfo.Int)
- case ClassByteBoxed | ClassBytePrim => Some(NodeInfo.Byte)
- case ClassShortBoxed | ClassShortPrim => Some(NodeInfo.Short)
- case ClassLongBoxed | ClassLongPrim => Some(NodeInfo.Long)
- case ClassDoubleBoxed | ClassDoublePrim => Some(NodeInfo.Double)
- case ClassFloatBoxed | ClassFloatPrim => Some(NodeInfo.Float)
- case ClassBooleanBoxed | ClassBooleanPrim => Some(NodeInfo.Boolean)
- case ClassString => Some(NodeInfo.String)
- case ClassJBigInt => Some(NodeInfo.Integer)
- case ClassJBigDecimal => Some(NodeInfo.Decimal)
- case ClassPrimByteArray => Some(NodeInfo.HexBinary)
+ case ClassIntBoxed | ClassIntPrim => Some(PrimType.Int)
+ case ClassByteBoxed | ClassBytePrim => Some(PrimType.Byte)
+ case ClassShortBoxed | ClassShortPrim => Some(PrimType.Short)
+ case ClassLongBoxed | ClassLongPrim => Some(PrimType.Long)
+ case ClassDoubleBoxed | ClassDoublePrim => Some(PrimType.Double)
+ case ClassFloatBoxed | ClassFloatPrim => Some(PrimType.Float)
+ case ClassBooleanBoxed | ClassBooleanPrim => Some(PrimType.Boolean)
+ case ClassString => Some(PrimType.String)
+ case ClassJBigInt => Some(PrimType.Integer)
+ case ClassJBigDecimal => Some(PrimType.Decimal)
+ case ClassPrimByteArray => Some(PrimType.HexBinary)
case _ => None
}
Review Comment:
This function has the same concern with fromObject. Is this something that
should also be removed? It looks like it's only use in UDF's, which because
they are implemented in Java do not support unsigned types and calendars, so
maybe this is fine. But we might need a comment as to why this ony supports
java primitives and not all DFDL primitives.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/api/JPrimType.java:
##########
Review Comment:
Git doesn't have a concept of "rename". All it knows is one file is deleted
and another file is added. But to be helpful, some git commands check if a
deleted file is "similar enough" to a new file, and if so it considers it a
rename with minor changes.
In thins case, both TestNodenfo.scala and JPrimType.java are small enough
that the matching license header meats that similar enough threshold.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/dpath/NodeInfo.scala:
##########
@@ -298,36 +279,36 @@ object NodeInfo extends Enum {
def fromObject(a: Any) = {
a match {
- case x: String => NodeInfo.String
Review Comment:
Is this even used? Seems like getting the PrimType of an object might be
wrong. For example, the object might be a "Long", but the correct primitive
type might be an `PrimType.UnsignedInt`. If this is used, I wonder if there are
possible bugs where we aren't using the correct primitive type? Feels like this
function might want to be removed for being potentially buggy?
In fact, codecov says it's not used. So we probably don't have a bug, but it
should probably be removed.
##########
daffodil-lib/src/main/scala/org/apache/daffodil/lib/util/Delay.scala:
##########
@@ -65,7 +65,7 @@ import org.apache.daffodil.lib.exceptions.Assert
*
* }}}
*/
-final class Delay[T] private (private var box: Delay.Box[T], sym: Symbol)
+final class Delay[T] private (private var box: Delay.Box[T], sym: AnyRef)
Review Comment:
Can you explain this `Symbol` to `AnyRef` change? We are still using and
passing in a `Symbol` as the parameter type, so it seems we're just losing type
information without much gain. And it looks like `SchemaSet` and `OOLAG` still
reference the `Symbol` type, so we haven't gotten rid of it completely. Should
we hold off on this change and fix all `Symbol` stuff at once as part of
upgrading to 2.13/3.0?
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/dpath/NodeInfo.scala:
##########
@@ -85,16 +82,16 @@ sealed abstract class TypeNode private (
) extends Serializable
with NodeInfo.Kind {
- def this(sym: Symbol, parents: => Seq[NodeInfo.Kind], children: =>
Seq[NodeInfo.Kind]) =
Review Comment:
Sounds like it was deprecated in 2.13 and removed in 3.0:
https://docs.scala-lang.org/scala3/guides/migration/incompat-dropped-features.html#symbol-literals
Mentioned in another comment, but maybe we should hold off on Symbol changes
until we upgrade to 2.13 an 3.0, we still widely use Symbol. According to that
link, it recommends to just replace Symbols with plain string literals. Though,
maybe there's a better alternative in our case.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/dpath/NodeInfo.scala:
##########
@@ -509,53 +491,28 @@ object NodeInfo extends Enum {
val Date = PrimType.Date
val Time = PrimType.Time
- /**
- * This list of types must be in order of most specific to least, i.e. Byte
- * inherits from Short, which inherits from Int etc. This is becasue
- * fromNodeInfo does a find on this list based on isSubtypeOf, which will
- * return the first successful result.
- */
- val allPrims = List(
- String,
- Byte,
- Short,
- Int,
- Long,
- UnsignedByte,
- UnsignedShort,
- UnsignedInt,
- UnsignedLong,
- NonNegativeInteger,
- Integer,
- Float,
- Double,
- Decimal,
- HexBinary,
- AnyURI,
- Boolean,
- DateTime,
- Date,
- Time,
- )
+ protected sealed trait PrimTypeKind extends AnyAtomic.Kind
/**
* The PrimType objects are a child enum within the overall NodeInfo
* enum.
*/
object PrimType {
+ type Kind = PrimTypeKind
- def fromRefQName(refQName: RefQName): Option[PrimType] = {
- allPrims.find { prim => refQName.matches(prim.globalQName) }
- }
+ private lazy val nameToPrimType: Map[String, PrimType] =
+ (new LinkedHashMap() ++= allDFDLTypes.map { ptn =>
+ (ptn.name, ptn)
+ }).toMap // to immutable
Review Comment:
Minor, but if we are calling `toMap` to make it immutable I believe we lose
the deterministic ordering of a `LinkedHashMap`. This is probably
fine--`allDFDLTypes` can be used if determinism is needed. And in that case, it
might be more clear to write it like:
```scala
allDFDLTypes.map { ptn => (ptn.name, ptn) }.toMap
```
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/dpath/NodeInfo.scala:
##########
@@ -198,58 +199,39 @@ object NodeInfo extends Enum {
}
def isError: Boolean = false
- def primType = this
+
+ private lazy val optPrimType_ = Some(this)
+ override def optPrimType: Option[PrimType] = optPrimType_
def fromXMLString(s: String): DataValuePrimitive
override def toString = name
}
- private def getTypeNode(name: String) = {
- val namelc = name.toLowerCase()
- allTypes.find(stn => stn.lcaseName == namelc)
- }
-
- /**
- * For Java API use, we have a very restricted trait api.PrimitiveType
- * mixed into PrimTypeNode, so that we can hand PrimTypeNode as result
- * from methods callable from Java without exposing all of PrimType's
- * implementation.
- * @param name lookup key, case insensitive
- * @return an api.PrimitiveType, or null if there is no type with that name.
- */
- def primitiveTypeFromName(name: String): PrimitiveType = {
- allDFDLTypesLookupTable.get(name)
- }
-
- def isXDerivedFromY(nameX: String, nameY: String): Boolean = {
- if (nameX == nameY) true
- else {
- getTypeNode(nameX) match {
- case Some(stn) => {
- stn.doesParentListContain(nameY)
- }
- case None => false
- }
- }
- }
-
sealed trait Kind extends EnumValueType with PrimTypeView {
def name: String = Misc.getNameFromClass(this)
def parents: Seq[Kind]
def children: Seq[Kind]
- final lazy val isHead: Boolean = parents.isEmpty
- final lazy val lcaseName = name.toLowerCase()
+ /**
+ * The prim type corresponding to this type.
+ * So if this Kind is derived from a PrimType kind, then
+ * this kind should return that prim type object.
+ *
+ * @return None if there is no corresponding prim type.
+ */
+ def optPrimType: Option[PrimType] = None
- final lazy val selfAndAllParents: Set[Kind] = parents.flatMap {
+ private final lazy val lcaseName = name.toLowerCase()
Review Comment:
Is there value in storing the lowercase version of this as a member? I feel
like memory is our big problem, everytime we add a new member we are taking up
just a little bit more memory. In fact, it's only used by `parentList`, and
that is only used by `doesParentListContain`, which just calls `toLowerCase`
again, so we aren't even saving any CPU by storing it. Probably a premature
optimization, but seems like `doesParentListContain` could be this:
```scala
def doesParentListContain(typeName: String): Boolean = {
selfAndParents.exists { _.name.equalsIngoreCase(typeName) }
}
```
Avoids both he lcaseName and parentList members, and I imagine is just as
fast.
All that said, I now see `doesParentListContain` is never even used, so I
suggest we just drop that function and the associated members it uses.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/api/Metadata.scala:
##########
@@ -155,58 +151,11 @@ trait ComplexElementMetadata extends ElementMetadata {
*/
trait SimpleElementMetadata extends ElementMetadata {
- def primitiveType: PrimitiveType
-}
-
-/**
- * Instances are static objects that represent the DFDL primitive types.
- */
-trait PrimitiveType {
- def name: String
-}
-
-/**
- * Static methods related to PrimitiveType objects
- */
-object PrimitiveType {
-
- private lazy val _list: java.util.List[PrimitiveType] =
- NodeInfo.allDFDLTypes.asInstanceOf[Seq[PrimitiveType]].asJava
-
- /**
- * Get a primitive type given a name string.
- *
- * @param name lookup key. Case insensitive.
- * @return the PrimitiveType with that name, or null if there is no such
primitive type.
- */
- def fromName(name: String): PrimitiveType =
- NodeInfo.primitiveTypeFromName(name)
-
/**
- * A list of all the primitive type objects.
+ * Primitive Type enum usable from Java
+ * @return
*/
- def list: java.util.List[PrimitiveType] = _list
-
- val String: PrimitiveType = PrimType.String
- val Int: PrimitiveType = PrimType.Int
- val Byte: PrimitiveType = PrimType.Byte
- val Short: PrimitiveType = PrimType.Short
- val Long: PrimitiveType = PrimType.Long
- val Integer: PrimitiveType = PrimType.Integer
- val Decimal: PrimitiveType = PrimType.Decimal
- val UnsignedInt: PrimitiveType = PrimType.UnsignedInt
- val UnsignedByte: PrimitiveType = PrimType.UnsignedByte
- val UnsignedShort: PrimitiveType = PrimType.UnsignedShort
- val UnsignedLong: PrimitiveType = PrimType.UnsignedLong
- val NonNegativeInteger: PrimitiveType = PrimType.NonNegativeInteger
- val Double: PrimitiveType = PrimType.Double
- val Float: PrimitiveType = PrimType.Float
- val HexBinary: PrimitiveType = PrimType.HexBinary
- val AnyURI: PrimitiveType = PrimType.AnyURI
- val Boolean: PrimitiveType = PrimType.Boolean
- val DateTime: PrimitiveType = PrimType.DateTime
- val Date: PrimitiveType = PrimType.Date
- val Time: PrimitiveType = PrimType.Time
+ def jPrimType: JPrimType
Review Comment:
For the public API, do we want to drop the `J` from the variable name and
enum? I would think any public API should assume Java
compatibility/friendliness, so we shouldn't need special types/names for it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]