[
https://issues.apache.org/jira/browse/MADLIB-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Domino Valdano updated MADLIB-1332:
-----------------------------------
Description:
Currently, {{keras_evaluate()}} is implemented by calling
{{internal_keras_evaluate()}} as a UDF. This requires the validation table
passed to {{keras_fit()}} to be in a format with only 1 image per row, even
though the training table is in a different format, with a batch of images in
every row. This is potentially confusing and cumbersome for users to deal
with, and based on some preliminary testing it seems that passing only 1 image
at a time to {{keras_evaluate()}} is also slowing down performance.
We can solve this by converting {{internal_keras_evaluate()}} into a UDA, so
that it runs on a minibatched validation table in the same form as the training
table.
Tasks:
* Convert the {{internal_keras_evaluate}} UDF to a UDA and perform weighted
averaging of loss and accuracy.
* Since x and y will now be minibatched, we don't need to add another
dimension to the {{x}} and {{y}} NumPy arrays in {{internal_keras_evaluate}}.
* Compare the UDF to the UDA and verify that the UDA results in a speed
improvement.
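
The weighted averaging in the first task can be sketched as follows. This is only an illustration under stated assumptions, not the MADlib implementation: the function names {{transition}}, {{merge}} and {{final}} and the state layout are hypothetical, and the per-batch loss and accuracy are assumed to come from evaluating the model on one minibatch row. Weighting by batch size matters because the last minibatch may hold fewer images than the others.

```python
# Hypothetical sketch of a UDA-style evaluation; these names are
# illustrative and are not MADlib's actual functions.

def transition(state, batch_loss, batch_accuracy, batch_size):
    """Fold one minibatch row into the running state.

    state is (weighted_loss_sum, weighted_accuracy_sum, image_count),
    or None before the first row has been seen.
    """
    if state is None:
        state = (0.0, 0.0, 0)
    loss_sum, acc_sum, count = state
    return (loss_sum + batch_loss * batch_size,
            acc_sum + batch_accuracy * batch_size,
            count + batch_size)


def merge(state_a, state_b):
    """Combine partial states computed on two different segments."""
    if state_a is None:
        return state_b
    if state_b is None:
        return state_a
    return tuple(a + b for a, b in zip(state_a, state_b))


def final(state):
    """Turn the accumulated sums into per-image averages."""
    if state is None or state[2] == 0:
        return None
    loss_sum, acc_sum, count = state
    return (loss_sum / count, acc_sum / count)


# Two minibatches of different sizes, as if processed on two segments.
seg1 = transition(None, batch_loss=0.5, batch_accuracy=0.75, batch_size=4)
seg2 = transition(None, batch_loss=0.25, batch_accuracy=1.0, batch_size=2)
loss, accuracy = final(merge(seg1, seg2))
```

Here a plain mean of the two batch losses would give 0.375, while the weighted result is 2.5 / 6, which is what evaluating all six images individually would yield.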
was:
Currently, `keras_evaluate()` is implemented by calling
`internal_keras_evaluate() as a UDF. This requires the validation table passed
to `keras_fit()` to be in a format with only 1 image per row, even though the
training table is in a different format, with a batch of images in every row.
This is potentially confusing and cumbersome for users to deal with, and based
on some preliminary testing it seems that passing only 1 image at a time to
`keras_evaluate()` is also slowing down performance.
We can solve this by converting `internal_keras_evaluate()` into a UDA, so that
it runs on a minibatched validation table in the same form as the training
table.
Tasks:
* Convert the {{internal_keras_evaluate}} UDF to a UDA and perform weighted
averaging of loss and accuracy.
* Since x and y will now be minibatched, we don't need to add another
dimension to {{x and y}} np arrays in {{internal_keras_evaluate}}.
* Compare UDF to UDA and verify that the UDA results in a speed improvement
> DL: Support mini-batched validation data for fit/evaluate
> ---------------------------------------------------------
>
> Key: MADLIB-1332
> URL: https://issues.apache.org/jira/browse/MADLIB-1332
> Project: Apache MADlib
> Issue Type: Improvement
> Components: Deep Learning
> Reporter: Domino Valdano
> Priority: Major
> Fix For: v1.16
>
>
> Currently, {{keras_evaluate()}} is implemented by calling
> {{internal_keras_evaluate()}} as a UDF. This requires the validation table
> passed to {{keras_fit()}} to be in a format with only 1 image per row, even
> though the training table is in a different format, with a batch of images in
> every row. This is potentially confusing and cumbersome for users to deal
> with, and based on some preliminary testing it seems that passing only 1
> image at a time to {{keras_evaluate()}} is also slowing down performance.
> We can solve this by converting {{internal_keras_evaluate()}} into a UDA, so
> that it runs on a minibatched validation table in the same form as the
> training table.
>
> Tasks:
> * Convert the {{internal_keras_evaluate}} UDF to a UDA and perform weighted
> averaging of loss and accuracy.
> * Since x and y will now be minibatched, we don't need to add another
> dimension to the {{x}} and {{y}} NumPy arrays in {{internal_keras_evaluate}}.
> * Compare the UDF to the UDA and verify that the UDA results in a speed
> improvement.
>