[jira] [Commented] (MAHOUT-637) Remove direct HBase dependency

2011-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014284#comment-13014284
 ] 

Hudson commented on MAHOUT-637:
---

Integrated in Mahout-Quality #708 (See 
[https://hudson.apache.org/hudson/job/Mahout-Quality/708/])


> Remove direct HBase dependency
> --
>
> Key: MAHOUT-637
> URL: https://issues.apache.org/jira/browse/MAHOUT-637
> Project: Mahout
> Issue Type: Improvement
> Components: Classification
> Affects Versions: 0.4
> Reporter: Sean Owen
> Assignee: Sean Owen
> Priority: Minor
> Labels: bayesian, gora, hbase
> Fix For: 0.5
>
> Attachments: MAHOUT-637.patch
>
>
> As discussed on the mailing list, seems desirable to remove the direct 
> dependence on HBase for now. The integration only exists for the Naive Bayes 
> Classifier, and is based on an old version. A more comprehensive strategy for 
> integrating with data sources, such as via Gora, is viewed as a desirable 
> goal for later. This is a step in that direction.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAHOUT-637) Remove direct HBase dependency

2011-03-29 Thread Dmitriy Lyubimov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012615#comment-13012615
 ] 

Dmitriy Lyubimov commented on MAHOUT-637:
-

Yes, OK. (For a moment it looked like you wanted to do more than just HBase
here.)



[jira] [Commented] (MAHOUT-637) Remove direct HBase dependency

2011-03-29 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012587#comment-13012587
 ] 

Sean Owen commented on MAHOUT-637:
--

I don't intend to do any more work like this, so it shouldn't be a big
conflict. Right now there's at most a 4-line conflict, since I've outright
deleted the HBase dependency. (And I think that was the consensus?)



[jira] [Commented] (MAHOUT-637) Remove direct HBase dependency

2011-03-29 Thread Dmitriy Lyubimov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012567#comment-13012567
 ] 

Dmitriy Lyubimov commented on MAHOUT-637:
-

I think it might make sense to unify the dependency work in one issue and
share it on GitHub with pull requests. Otherwise we might end up entrenched
in conflicting merges.



[jira] [Commented] (MAHOUT-637) Remove direct HBase dependency

2011-03-29 Thread Dmitriy Lyubimov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012556#comment-13012556
 ] 

Dmitriy Lyubimov commented on MAHOUT-637:
-

As of MAHOUT-622, it (the HBase dependency) is declared optional. That is,
anyone who embeds Mahout in their build will not get it transitively, and any
algorithm that uses it will fail at runtime with a NoClassDefFoundError unless
they re-declare the dependency explicitly (which also gives them the
opportunity to override the version).
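
To illustrate what "would not find it there" means at runtime, here is a
minimal sketch of probing for an optional dependency before using it. The
class and method names are hypothetical and not part of Mahout, and the HBase
class name in main() is only an example of a class that could be missing when
the optional dependency is not re-declared.

```java
class OptionalDependencyCheck {

  // Hypothetical helper: reports whether a class can be loaded, without
  // initializing it. A class from an undeclared optional dependency is
  // simply absent, which surfaces as ClassNotFoundException here (or as
  // NoClassDefFoundError when code that was compiled against it is run).
  static boolean isOnClasspath(String className) {
    try {
      Class.forName(className, false,
          OptionalDependencyCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Example class name only; the result depends on the caller's classpath.
    String hbaseClass = "org.apache.hadoop.hbase.client.HTable";
    System.out.println("HBase available: " + isOnClasspath(hbaseClass));
  }
}
```

A caller could use such a check to skip or disable the HBase-backed code path
instead of failing partway through a job.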



[jira] [Commented] (MAHOUT-637) Remove direct HBase dependency

2011-03-29 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012537#comment-13012537
 ] 

Sean Owen commented on MAHOUT-637:
--

Charset.forName() doesn't throw a checked exception, unlike some similar
older methods. But yes, the Guava helper class is shorter and avoids typos
("UFT-8"); if there is going to be an error, it shows up much earlier. I
think this import/dependency earns its keep.

I can easily search-and-replace for this now or later.
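
A small sketch of the typo case, for illustration: a misspelled charset name
compiles fine and only fails when the lookup runs, with an unchecked
UnsupportedCharsetException. The class and method names below are
hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;

class CharsetTypoExample {

  // Returns true if this JVM can resolve the charset name, false if the
  // lookup throws the unchecked UnsupportedCharsetException.
  static boolean isKnownCharset(String name) {
    try {
      Charset.forName(name);
      return true;
    } catch (UnsupportedCharsetException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isKnownCharset("UTF-8")); // correctly spelled name
    System.out.println(isKnownCharset("UFT-8")); // typo, caught only at runtime
  }
}
```

With a named constant, the same typo would be rejected by the compiler
instead.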



[jira] [Commented] (MAHOUT-637) Remove direct HBase dependency

2011-03-29 Thread Ted Dunning (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012529#comment-13012529
 ] 

Ted Dunning commented on MAHOUT-637:


Regarding lines like this:
{code}
items.add(new String(data, Charset.forName("UTF-8")));
{code}
We have a dependency on Guava, I think. It would allow
{code}
items.add(new String(data, Charsets.UTF_8));
{code}
The benefit is that no exception has to be propagated or caught.
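
For comparison, a minimal sketch of three ways to decode bytes. To stay
self-contained it uses the JDK's java.nio.charset.StandardCharsets.UTF_8
(Java 7 and later) as a stand-in for Guava's Charsets.UTF_8; the class and
method names are hypothetical and not taken from the Mahout code.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeExample {

  // Charset given as a String name: the compiler forces the caller to
  // handle the checked UnsupportedEncodingException.
  static String decodeByName(byte[] data) {
    try {
      return new String(data, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  // Charset.forName: no checked exception, but the name is still a String
  // that is only validated when this line runs.
  static String decodeByLookup(byte[] data) {
    return new String(data, Charset.forName("UTF-8"));
  }

  // Named constant: nothing to catch and no String name to mistype.
  static String decodeByConstant(byte[] data) {
    return new String(data, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] data = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 bytes for U+00E9
    System.out.println(decodeByName(data).equals(decodeByLookup(data)));
    System.out.println(decodeByLookup(data).equals(decodeByConstant(data)));
  }
}
```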



[jira] [Commented] (MAHOUT-637) Remove direct HBase dependency

2011-03-29 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012503#comment-13012503
 ] 

Sean Owen commented on MAHOUT-637:
--

PS: we still have references to KosmosFS, but I see no usages of it in the
code. I'll try removing that too.

> Remove direct HBase dependency
> --
>
> Key: MAHOUT-637
> URL: https://issues.apache.org/jira/browse/MAHOUT-637
> Project: Mahout
>  Issue Type: Improvement
>  Components: Classification
>Affects Versions: 0.4
>Reporter: Sean Owen
>Assignee: Robin Anil
>Priority: Minor
>  Labels: bayesian, gora, hbase
> Fix For: 0.5
>
>
> As discussed on the mailing list, seems desirable to remove the direct 
> dependence on HBase for now. The integration only exists for the Naive Bayes 
> Classifier, and is based on an old version. A more comprehensive strategy for 
> integrating with data sources, such as via Gora, is viewed as a desirable 
> goal for later. This is a step in that direction.
