Unfortunately, only committers can change the website at the moment. If
you have text to add, I'm happy to work it in and add your name to our
contributors list in the CHANGELOG.
Best,
Sebastian
On 03/05/2014 04:58 PM, Scott C. Cote wrote:
I had recently taken the text tour of mahout, but
I mean "balance the risk aversion against the value of new features" duh.
On Wed, Mar 5, 2014 at 1:39 PM, Andrew Musselman wrote:
> Yeah, for sure; balancing clients' risk aversion to technical features is
> why we often recommend vendor solutions.
>
> Having a little button to choose a newer v
Yeah, for sure; balancing clients' risk aversion to technical features is
why we often recommend vendor solutions.
Having a little button to choose a newer version of a component in the
Manager UI (even with a confirmation dialog that said "Are you sure? Are
you crazy?") would be more palatable to
You can always install whatever version of anything on your cluster
that you want. It may or may not work, but often happens to, at least
for whatever you need it to do.
It's just the same as it is without a packaged distribution -- dump
new tarballs and cross your fingers. Nothing is weird or dif
Feels like just yesterday :)
Consider this a feature request to have more flexible component versioning,
even with a caveat/"here be dragons" warning. I know that complicates
things but people do use your releases a long time. I personally wished I
could upgrade Pig on CDH 4 for new features but
I don't understand this -- CDH always bundles the latest release.
You know that CDH4 was released in July 2012, right? So it included
0.7 + patches. CDH5 includes 0.8 because 0.9 was released about a
month after it began beta 2.
CDH follows semantic versioning and won't introduce changes that are
I also prefer design 2
On Wed, Mar 5, 2014 at 11:08 AM, Frank Scholten wrote:
> +1 for design 2
>
>
> On Wed, Mar 5, 2014 at 6:00 PM, Suneel Marthi wrote:
>
> > +1 for Option# 2.
> >
> > On Wednesday, March 5, 2014 7:11 AM, Sebastian Schelter
> > wrote:
> >
> > Hi everyone,
>
+1 for design 2
On Wed, Mar 5, 2014 at 6:00 PM, Suneel Marthi wrote:
> +1 for Option# 2.
>
> On Wednesday, March 5, 2014 7:11 AM, Sebastian Schelter
> wrote:
>
> Hi everyone,
>
> In our latest discussion, I argued that the lack (and errors) of
> documentation on our website is one of th
On Wed, Mar 5, 2014 at 9:08 AM, Sean Owen wrote:
> I don't follow what here makes you say they are "cut down" releases?
>
meaning it seems to be pretty much 2 releases behind the official one. But I
definitely don't follow CDH developments in this department; you seem in a
better position to explain
I apologize, Sean; I wasn't aware of the complete history in this thread. I
didn't know Hadoop 2.x was involved here. If so, yes, you need to build
Mahout against HEAD with the Hadoop 2 profile to get it working.
On Wednesday, March 5, 2014 12:04 PM, Sean Owen wrote:
CDH 4.5 and 4.6 are both
For SVD-based algorithms, you should use the AllUnknownItems
strategy then, that's correct.
In the majority of industry use cases that I have seen, people use
pre-computed item similarities (Mahout has lots of machinery for doing
this, btw), so AllSimilarItems totally makes sense there.
I don't follow what here makes you say they are "cut down" releases?
They are release plus patches not release minus patches.
The question is not about how to use 0.7, but how to use 1.0-SNAPSHOT.
Why would switching to the "official" 0.7 release help?
I think the answer is "you build Mahout for
Yeah, it would seem the CDH releases of Mahout are some sort of cut-down
version. I suggest switching to the official release tarball (or writing
to Cloudera support about it).
On Wed, Mar 5, 2014 at 8:38 AM, Andrew Musselman wrote:
> I'm not sure about this either but I think these are all th
CDH 4.5 and 4.6 are both 0.7 + patches. Neither contains 0.8, since it
has (tiny) breaking changes vs 0.7 and this is a minor version update.
CDH5 contains 0.8 + patches. I did not say CDH4 has 0.8 -- re-read the
message of mine that was quoted.
http://archive.cloudera.com/cdh4/cdh/4/mahout-0.7-cd
It can even make things worse in SVD-based algorithms for which
preference estimation is very fast.
On Wed, Mar 5, 2014 at 7:00 PM, Tevfik Aytekin wrote:
> Hi Sebastian,
> But in order not to select items that are not similar to at least one
> of the items the user interacted with you have to comp
I agree. IMHO using the Mahout recommenders is wrong for this. The recommenders
are the CF/cooccurrence type that expect usage or rating data on fairly
long-lived items from a somewhat static catalog. Trying to make them work for
content based recommendations is needlessly difficult especially s
Hi Sebastian,
But in order not to select items that are not similar to at least one
of the items the user interacted with you have to compute the
similarity with all user items (which is the main task for estimating
the preference of an item in item-based method). So, it seems to me
that AllSimilarI
+1 for Option# 2.
On Wednesday, March 5, 2014 7:11 AM, Sebastian Schelter wrote:
Hi everyone,
In our latest discussion, I argued that the lack (and errors) of
documentation on our website is one of the main pain points of Mahout
atm. To be honest, I'm also not very happy with the design,
> So both strategies seem to be effectively the same, I don't know what
> the implementers had in mind when designing
> AllSimilarItemsCandidateItemsStrategy.
It can take a long time to estimate preferences for all items a user
doesn't know. Especially if you have a lot of items. Traditional
i
I'm not sure about this either but I think these are all the changes to
Mahout in CDH 4.6.0:
http://archive.cloudera.com/cdh4/cdh/4/mahout-0.7-cdh4.6.0.CHANGES.txt
MAHOUT-1291
MAHOUT-1033
MAHOUT-1142
On Wed, Mar 5, 2014 at 8:30 AM, Suneel Marthi wrote:
> Not sure if the CDH4 patches on top o
If the similarities between item 5 and two of the items user 1 preferred are
not NaN, then it will return 1; that is what I'm saying. If the
similarities were all NaN, then it will not return it.
But surely, you might wonder if all similarities between an item and
user's items are NaN, then
AllUnknown
Not sure if the CDH4 patches on top of 0.7 have fixes for M-1067 and M-1098
which address the issues u r seeing.
The second part of the issue u r seeing with Mahout 0.9 distro seems to be
related to how u set it up on CDH4. I apologize for not being helpful here as I
am not a CDH4 user or expe
@Tevfik, running this recommender:
GenericItemBasedRecommender itemRecommender = new GenericItemBasedRecommender(
    dataModel, itemSimilarity,
    new AllSimilarItemsCandidateItemsStrategy(itemSimilarity),
    new AllSimilarItemsCandidateItemsStrategy(itemSimilarity));
With this dataModel:
1,1,1.0
1,2,2.0
@Pat. You described my situation very well. The only additional thing is
that I am also interested in creating some sort of a profile from the user
with all the information s/he has provided by interacting with the articles
and not only recommending similar items (news) based on a specific input.
T
I had recently taken the text tour of mahout, but I couldn't decipher a
way to contribute updates to the tour (some of the file names have
changed, etc).
How would I start? (this was part of my offer to help with the
documentation of Mahout).
Scott
On 3/5/14 9:47 AM, "Pat Ferrel" wrote:
>Wha
On Wed, Mar 5, 2014 at 7:47 AM, Pat Ferrel wrote:
> What no centered text??
>
> ;-)
>
> Love either.
>
> BTW users are no longer able to contribute content to the wiki. Most CMSs
> have a way to allow input that is moderated. Might this make getting
> documentation help easier? Allow anyone t
What no centered text??
;-)
Love either.
BTW users are no longer able to contribute content to the wiki. Most CMSs have
a way to allow input that is moderated. Might this make getting documentation
help easier? Allow anyone to contribute but committers can filter out the
bad—sort of like
I am ignoring the rest of the thread because I suspect it may have gotten off
track.
Your data is new articles, right? You would like to recommend from known
articles to any user based on an article they rate or even view. You have no
collaborative filtering data because the lifetime of a news
Previous mail sent only to Suneel : (my bad sorry)
According to my stacktrace it seems that I am running mahout 0.7 indeed.
> That's the version provided by Cloudera when I install mahout using yum.
> But according to Sean Owen, it really is a 0.8 inside...
> Anyway I tried with the compiled versi
Both are nice.
I think you are right that the second is calmer.
On Wed, Mar 5, 2014 at 4:11 AM, Sebastian Schelter wrote:
> Hi everyone,
>
> In our latest discussion, I argued that the lack (and errors) of
> documentation on our website is one of the main pain points of Mahout atm.
> To be hon
Juan,
You got me wrong,
AllSimilarItemsCandidateItemsStrategy
returns all items that have not been rated by the user and the
similarity metric returns a non-NaN similarity value with at
least one of the items preferred by the user.
So, it does not simply return all items that have not been rated
Are u using Mahout 0.7 ?
From this line in ur stacktrace that seems to be the case:
MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.5.0-job.jar
You could build Mahout outside of CDH from Mahout trunk and put the jars onto
CDH5.
I am no Cloudera expert or CDH5 user to help with CDHx build.
Hi Tevfik,
Thanks for the response. I think what you say contradicts what Sebastian
pointed out before. Also, if AllSimilarItemsCandidateItemsStrategy returns
all items that have not been rated by the user, what would
AllUnknownItemsCandidateItemsStrategy return?
On Wed, Mar 5, 2014 at 1:40 PM,
Hi and thanks for your help!
I had been told that the version of mahout used by Cloudera (CDH 4.6) was
in fact 0.8 with a patch for mr2 support.
(
http://mail-archives.apache.org/mod_mbox/mahout-user/201402.mbox/%3CCAEccTywqSAKA_HeX4vTZ-5XPmKtj5b8zMGQUfn5qRsiq=7o=u...@mail.gmail.com%3E)
But I tri
Sorry there was a typo in the previous paragraph.
If I remember correctly, AllSimilarItemsCandidateItemsStrategy
returns all items that have not been rated by the user and the
similarity metric returns a non-NaN similarity value with at
least one of the items preferred by the user.
On Wed, Mar 5
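To make the distinction discussed in this thread concrete, here is a minimal plain-Java sketch of the two selection rules as the posters describe them: "all unknown items" keeps every item the user has not rated, while "all similar items" additionally requires a non-NaN similarity to at least one item the user preferred. It does not use Mahout at all. The class name, the Similarity interface, and the toy data are assumptions made for illustration and are not Mahout's actual implementation.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class CandidateStrategiesSketch {

    /** Hypothetical similarity function; NaN means no similarity could be computed. */
    public interface Similarity {
        double between(long itemA, long itemB);
    }

    /** "All unknown items" as described in the thread: every item the user has not rated. */
    public static Set<Long> allUnknownItems(Set<Long> allItems, Set<Long> preferred) {
        Set<Long> candidates = new TreeSet<>(allItems);
        candidates.removeAll(preferred);
        return candidates;
    }

    /**
     * "All similar items" as described in the thread: unrated items that have a
     * non-NaN similarity to at least one item the user preferred.
     */
    public static Set<Long> allSimilarItems(Set<Long> allItems, Set<Long> preferred,
                                            Similarity similarity) {
        Set<Long> candidates = new TreeSet<>();
        for (long item : allItems) {
            if (preferred.contains(item)) {
                continue; // already rated by the user
            }
            for (long preferredItem : preferred) {
                if (!Double.isNaN(similarity.between(item, preferredItem))) {
                    candidates.add(item);
                    break; // one non-NaN similarity is enough
                }
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        Set<Long> allItems = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L, 5L));
        Set<Long> preferred = new TreeSet<>(Arrays.asList(1L, 2L));
        // Toy similarity (an assumption): nothing can be computed for item 5.
        Similarity similarity = (a, b) -> (a == 5L || b == 5L) ? Double.NaN : 0.5;

        System.out.println(allUnknownItems(allItems, preferred));             // [3, 4, 5]
        System.out.println(allSimilarItems(allItems, preferred, similarity)); // [3, 4]
    }
}
```

With this toy data the two rules differ only for item 5, which is the case the thread is arguing about: an unrated item whose similarities to the user's items are all NaN.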
Hi Juan,
If I remember correctly, AllSimilarItemsCandidateItemsStrategy
returns all items that have not been rated by the user and the
similarity metric returns a non-NaN similarity value that is with at
least one of the items preferred by the user.
Tevfik
On Wed, Mar 5, 2014 at 2:30 PM, Sebast
I liked both of them
Great work Lucas!
Gokhan
On Wed, Mar 5, 2014 at 2:11 PM, Sebastian Schelter wrote:
> Hi everyone,
>
> In our latest discussion, I argued that the lack (and errors) of
> documentation on our website is one of the main pain points of Mahout atm.
> To be honest, I'm also not
On 03/05/2014 01:23 PM, Juan José Ramos wrote:
Thanks for the reply, Sebastian.
I am not sure if that should be implemented in the abstract base class
though, because, for
instance, PreferredItemsNeighborhoodCandidateItemsStrategy, by definition,
returns the items not rated by the user but rated
Thanks for the reply, Sebastian.
I am not sure if that should be implemented in the abstract base class
though, because, for
instance, PreferredItemsNeighborhoodCandidateItemsStrategy, by definition,
returns the items not rated by the user but rated by somebody else.
Back to my last post, I have b
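The neighborhood-style selection mentioned above (items not rated by the user but rated by somebody else) can be sketched in plain Java as follows. This is only an illustration: it assumes that "somebody else" means users who share at least one rated item with the target user, which is a reading of the word "neighborhood" in the class name and not something the thread states. The class name, the map-based data model, and the toy data are hypothetical and do not come from Mahout.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class NeighborhoodCandidatesSketch {

    /**
     * ratedItems maps a user ID to the set of item IDs that user has rated
     * (a hypothetical in-memory stand-in for a data model).
     */
    public static Set<Long> candidates(Map<Long, Set<Long>> ratedItems, long userId) {
        Set<Long> own = ratedItems.getOrDefault(userId, Collections.<Long>emptySet());
        Set<Long> result = new TreeSet<>();
        for (Map.Entry<Long, Set<Long>> entry : ratedItems.entrySet()) {
            if (entry.getKey() == userId) {
                continue; // skip the target user
            }
            // Assumption: only users sharing at least one item count as neighbors.
            if (Collections.disjoint(entry.getValue(), own)) {
                continue;
            }
            result.addAll(entry.getValue());
        }
        result.removeAll(own); // never suggest what the user already rated
        return result;
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> ratedItems = new HashMap<>();
        ratedItems.put(1L, new TreeSet<>(Arrays.asList(1L, 2L))); // target user
        ratedItems.put(2L, new TreeSet<>(Arrays.asList(2L, 3L))); // shares item 2
        ratedItems.put(3L, new TreeSet<>(Arrays.asList(4L)));     // shares nothing

        System.out.println(candidates(ratedItems, 1L)); // [3]
    }
}
```

Under this reading, item 4 is never a candidate for user 1 because nobody who overlaps with user 1 has rated it, which is why the choice of candidate strategy can change what a recommender is able to return.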
Hi everyone,
In our latest discussion, I argued that the lack (and errors) of
documentation on our website is one of the main pain points of Mahout
atm. To be honest, I'm also not very happy with the design, especially
fonts and spacing make it super hard to read long articles. This also
prev
Hi Juan,
that is a good catch. CandidateItemsStrategy is the right place to
implement this. Maybe we should simply extend its interface to add a
parameter that says whether to keep or remove the current user's items?
We could even do this in the abstract base class then.
--sebastian
On 03/05
In case somebody runs into the same situation, the key seems to be the
CandidateItemsStrategy passed to the constructor
of GenericItemBasedRecommender. Looking into the code, if no
CandidateItemsStrategy is specified in the
constructor, PreferredItemsNeighborhoodCandidateItemsStrategy is use
Hi
Here are my actions and the problematic result again:
[hduser@vm38 ~]$ git clone https://github.com/apache/mahout.git
remote: Reusing existing pack: 76099, done.
remote: Counting objects: 39, done.
remote: Compressing objects: 100% (32/32), done.
remote: Total 76138 (delta 2), reused 0 (delta