[GitHub] incubator-hivemall pull request #95: [HIVEMALL-119] Fix type cast issues in ...

2017-07-06 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/incubator-hivemall/pull/95#discussion_r125930040
  
--- Diff: xgboost/src/main/java/hivemall/xgboost/XGBoostUDTF.java ---
@@ -269,44 +270,35 @@ public void checkTargetValue(double target) throws 
HiveException {}
 public void process(Object[] args) throws HiveException {
--- End diff --

Is it ok to just call `mvn formatter:format`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hivemall pull request #95: [HIVEMALL-119] Fix type cast issues in ...

2017-07-06 Thread myui
Github user myui commented on a diff in the pull request:

https://github.com/apache/incubator-hivemall/pull/95#discussion_r125849451
  
--- Diff: xgboost/src/main/java/hivemall/xgboost/XGBoostUDTF.java ---
@@ -269,44 +274,33 @@ public void checkTargetValue(double target) throws 
HiveException {}
 public void process(Object[] args) throws HiveException {
 if (args[0] != null) {
 // TODO: Need to support dense inputs
-final List features = (List) 
featureListOI.getList(args[0]);
+final List features = (List) 
featureListOI.getList(args[0]);
+final String[] fv = new String[features.size()];
+for (int i = 0; i < features.size(); i++) {
+fv[i] = (String) 
featureElemOI.getPrimitiveJavaObject(features.get(i));
+}
 double target = 
PrimitiveObjectInspectorUtils.getDouble(args[1], this.targetOI);
 checkTargetValue(target);
-final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
features);
+final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
fv);
 if (point != null) {
 this.featuresList.add(point);
 }
 }
 }
 
-/**
- * Need to override this for a Spark wrapper because `MapredContext` 
does not work in there.
- */
-protected String generateUniqueModelId() {
-return "xgbmodel-" + String.valueOf(HadoopUtils.getTaskId());
+private String generateUniqueModelId() {
+return "xgbmodel-" + taskId + "-" + RandomUtils.getUUID() + "-" + 
sequence++;
--- End diff --

If one model per a task, `RandomUtils.getUUID()` is enough. 

It's synchronized and thus would be safe enough for Spark multiple thread 
execution.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hivemall pull request #95: [HIVEMALL-119] Fix type cast issues in ...

2017-07-06 Thread myui
Github user myui commented on a diff in the pull request:

https://github.com/apache/incubator-hivemall/pull/95#discussion_r125847385
  
--- Diff: xgboost/src/main/java/hivemall/xgboost/XGBoostUDTF.java ---
@@ -269,44 +270,35 @@ public void checkTargetValue(double target) throws 
HiveException {}
 public void process(Object[] args) throws HiveException {
 if (args[0] != null) {
 // TODO: Need to support dense inputs
-final List features = (List) 
featureListOI.getList(args[0]);
+final List features = (List) 
featureListOI.getList(args[0]);
+final String[] fv = new String[features.size()];
+for (int i = 0; i < features.size(); i++) {
+fv[i] = (String) 
featureElemOI.getPrimitiveJavaObject(features.get(i));
+}
 double target = 
PrimitiveObjectInspectorUtils.getDouble(args[1], this.targetOI);
 checkTargetValue(target);
-final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
features);
+final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
fv);
 if (point != null) {
 this.featuresList.add(point);
 }
 }
 }
 
-/**
- * Need to override this for a Spark wrapper because `MapredContext` 
does not work in there.
- */
-protected String generateUniqueModelId() {
-return "xgbmodel-" + String.valueOf(HadoopUtils.getTaskId());
+private String generateUniqueModelId() {
+return "xgbmodel-" + HadoopUtils.getUniqueTaskIdString();
 }
 
-private static Booster createXGBooster(final Map 
params,
-final List input) throws XGBoostError {
-try {
-Class[] args = {Map.class, DMatrix[].class};
-Constructor ctor;
-ctor = Booster.class.getDeclaredConstructor(args);
-ctor.setAccessible(true);
-return ctor.newInstance(new Object[] {params,
-new DMatrix[] {new DMatrix(input.iterator(), "")}});
-} catch (InstantiationException e) {
-// Catch java reflection error as fast as possible
-e.printStackTrace();
-} catch (IllegalAccessException e) {
-e.printStackTrace();
-} catch (InvocationTargetException e) {
-e.printStackTrace();
-} catch (NoSuchMethodException e) {
-e.printStackTrace();
-}
-// No one reach here
-return null;
+@Nonnull
+private static Booster createXGBooster(
+final Map params, final List 
input)
+throws NoSuchMethodException, XGBoostError, 
IllegalAccessException,
+InvocationTargetException, InstantiationException {
+Class[] args = {Map.class, DMatrix[].class};
+Constructor ctor;
+ctor = Booster.class.getDeclaredConstructor(args);
--- End diff --

Why not ?

`Constructor ctor  = Booster.class.getDeclaredConstructor(args);`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hivemall pull request #95: [HIVEMALL-119] Fix type cast issues in ...

2017-07-06 Thread myui
Github user myui commented on a diff in the pull request:

https://github.com/apache/incubator-hivemall/pull/95#discussion_r125845977
  
--- Diff: xgboost/src/main/java/hivemall/xgboost/XGBoostUDTF.java ---
@@ -269,44 +274,33 @@ public void checkTargetValue(double target) throws 
HiveException {}
 public void process(Object[] args) throws HiveException {
 if (args[0] != null) {
 // TODO: Need to support dense inputs
-final List features = (List) 
featureListOI.getList(args[0]);
+final List features = (List) 
featureListOI.getList(args[0]);
+final String[] fv = new String[features.size()];
+for (int i = 0; i < features.size(); i++) {
+fv[i] = (String) 
featureElemOI.getPrimitiveJavaObject(features.get(i));
+}
 double target = 
PrimitiveObjectInspectorUtils.getDouble(args[1], this.targetOI);
 checkTargetValue(target);
-final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
features);
+final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
fv);
 if (point != null) {
 this.featuresList.add(point);
 }
 }
 }
 
-/**
- * Need to override this for a Spark wrapper because `MapredContext` 
does not work in there.
- */
-protected String generateUniqueModelId() {
-return "xgbmodel-" + String.valueOf(HadoopUtils.getTaskId());
+private String generateUniqueModelId() {
+return "xgbmodel-" + taskId + "-" + RandomUtils.getUUID() + "-" + 
sequence++;
--- End diff --

Ah, multiple models are created in a single task. So, `sequence` is 
required.

`taskId` could be useful just in case UUIDs are corrupting. So, okey as it 
is. 

So, `"xgbmodel-" + taskId + "-" + HadoopUtils.getUniqueTaskIdString() + "-" 
+ sequence++;"` ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hivemall pull request #95: [HIVEMALL-119] Fix type cast issues in ...

2017-07-06 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/incubator-hivemall/pull/95#discussion_r125845972
  
--- Diff: xgboost/src/main/java/hivemall/xgboost/XGBoostUDTF.java ---
@@ -326,7 +320,7 @@ public void close() throws HiveException {
 logger.info("model_id:" + modelId.toString() + " size:" + 
predModel.length);
 forward(new Object[] {modelId, predModel});
 } catch (Exception e) {
--- End diff --

It seems we can't cuz `close()` only throws `HiveException`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hivemall pull request #95: [HIVEMALL-119] Fix type cast issues in ...

2017-07-06 Thread myui
Github user myui commented on a diff in the pull request:

https://github.com/apache/incubator-hivemall/pull/95#discussion_r125842906
  
--- Diff: xgboost/src/main/java/hivemall/xgboost/XGBoostUDTF.java ---
@@ -269,44 +274,33 @@ public void checkTargetValue(double target) throws 
HiveException {}
 public void process(Object[] args) throws HiveException {
 if (args[0] != null) {
 // TODO: Need to support dense inputs
-final List features = (List) 
featureListOI.getList(args[0]);
+final List features = (List) 
featureListOI.getList(args[0]);
+final String[] fv = new String[features.size()];
+for (int i = 0; i < features.size(); i++) {
+fv[i] = (String) 
featureElemOI.getPrimitiveJavaObject(features.get(i));
+}
 double target = 
PrimitiveObjectInspectorUtils.getDouble(args[1], this.targetOI);
 checkTargetValue(target);
-final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
features);
+final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
fv);
 if (point != null) {
 this.featuresList.add(point);
 }
 }
 }
 
-/**
- * Need to override this for a Spark wrapper because `MapredContext` 
does not work in there.
- */
-protected String generateUniqueModelId() {
-return "xgbmodel-" + String.valueOf(HadoopUtils.getTaskId());
+private String generateUniqueModelId() {
+return "xgbmodel-" + taskId + "-" + RandomUtils.getUUID() + "-" + 
sequence++;
 }
 
+@Nonnull
 private static Booster createXGBooster(final Map 
params,
-final List input) throws XGBoostError {
-try {
-Class[] args = {Map.class, DMatrix[].class};
-Constructor ctor;
-ctor = Booster.class.getDeclaredConstructor(args);
-ctor.setAccessible(true);
-return ctor.newInstance(new Object[] {params,
-new DMatrix[] {new DMatrix(input.iterator(), "")}});
-} catch (InstantiationException e) {
-// Catch java reflection error as fast as possible
-e.printStackTrace();
-} catch (IllegalAccessException e) {
-e.printStackTrace();
-} catch (InvocationTargetException e) {
-e.printStackTrace();
-} catch (NoSuchMethodException e) {
-e.printStackTrace();
-}
-// No one reach here
-return null;
+final List input) throws Exception {
--- End diff --

change `throws Exception`  to `throws XXXException, XGBoostError `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hivemall pull request #95: [HIVEMALL-119] Fix type cast issues in ...

2017-07-06 Thread myui
Github user myui commented on a diff in the pull request:

https://github.com/apache/incubator-hivemall/pull/95#discussion_r125843584
  
--- Diff: xgboost/src/main/java/hivemall/xgboost/XGBoostUDTF.java ---
@@ -269,44 +274,33 @@ public void checkTargetValue(double target) throws 
HiveException {}
 public void process(Object[] args) throws HiveException {
 if (args[0] != null) {
 // TODO: Need to support dense inputs
-final List features = (List) 
featureListOI.getList(args[0]);
+final List features = (List) 
featureListOI.getList(args[0]);
+final String[] fv = new String[features.size()];
+for (int i = 0; i < features.size(); i++) {
+fv[i] = (String) 
featureElemOI.getPrimitiveJavaObject(features.get(i));
+}
 double target = 
PrimitiveObjectInspectorUtils.getDouble(args[1], this.targetOI);
 checkTargetValue(target);
-final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
features);
+final LabeledPoint point = XGBoostUtils.parseFeatures(target, 
fv);
 if (point != null) {
 this.featuresList.add(point);
 }
 }
 }
 
-/**
- * Need to override this for a Spark wrapper because `MapredContext` 
does not work in there.
- */
-protected String generateUniqueModelId() {
-return "xgbmodel-" + String.valueOf(HadoopUtils.getTaskId());
+private String generateUniqueModelId() {
+return "xgbmodel-" + taskId + "-" + RandomUtils.getUUID() + "-" + 
sequence++;
--- End diff --

Ah my bad. Better to  use `HadoopUtils.getUniqueTaskIdString()`. Could you 
change to it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hivemall pull request #95: [HIVEMALL-119] Fix type cast issues in ...

2017-07-06 Thread myui
Github user myui commented on a diff in the pull request:

https://github.com/apache/incubator-hivemall/pull/95#discussion_r125842738
  
--- Diff: xgboost/src/main/java/hivemall/xgboost/XGBoostUDTF.java ---
@@ -326,7 +320,7 @@ public void close() throws HiveException {
 logger.info("model_id:" + modelId.toString() + " size:" + 
predModel.length);
 forward(new Object[] {modelId, predModel});
 } catch (Exception e) {
--- End diff --

throw Exception directly without wrapping `e` in HiveException.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---