/mahout-integration/RecommenderServlet?userID=1&debug=true
Sean, do you have any link which specifies the correct steps to run the
recommendation demo?
Thanks,
Yugang
On Tue, Jan 24, 2012 at 2:31 PM, Sean Owen sro...@gmail.com wrote:
Backing up a sec -- I looked, and there is not a circular
You're talking about recommendations now... are we talking about a
clustering, classification or recommender system?
In general I don't know if it makes sense for business users to be
deciding aspects of the internal model. At most someone should input
the tradeoffs -- how important is accuracy
You are making recommendations, and you want to do this via
clustering. OK, that's fine. How you implement it isn't so important
-- it's that you have some parameters to change and want to know how
any given process does.
You just want to use some standard recommender metrics, to start, I'd
I think you would cluster these like any other text document. The
centroid of each cluster tells you where the cluster is in
feature-space, but the features are just words. If you find the
features (words) with largest absolute value, those ought to be the
words that appear frequently in the
In theory this is what the system is learning for you, that there is some
pattern to the preferences and so someone who likes Julia Roberts's movies
would tend to be recommended more of them.
So I suppose I'd advise against making a pseudo-item out of a feature
unless you have specific, new
I don't know if there's any particular preferred format. I think you'd
generally cite the web site, and follow any standard citation format for
that.
Sean
On Sun, Apr 8, 2012 at 8:24 PM, Ahmed Abdeen Hamed
ahmed.elma...@gmail.comwrote:
Hello,
Is there a specific format the Mahout developers
(Hmm, I don't know why it doesn't post to the mailing list. We get a
message about moderating everything. I'll copy it to the list now.)
In #1, you describe the usual user-item preference matrix. Yes it's
sparse. I guess you could make up pseudo-items like genre in the
matrix, yes, if you had
This means you have an incompatible version of Lucene in your app at
runtime. Use the same one Mahout uses.
On Fri, Apr 6, 2012 at 2:21 PM, Tristan Slominski
tristan.slomin...@gmail.com wrote:
Hello group,
I managed to get Mahout running.. awesome! But I keep on running into
issues that break
It's coming from somewhere else then. I think you'd want to examine
the rest of the classpaths. You do not need to put Lucene jars in the
classpath yourself. It will just cause issues. You will need to make
sure you're looking at the remote cluster's classpath, if it's remote.
On Fri, Apr 6,
It might or might not be interesting to comment on this discussion in
light of the new product/project I mentioned last night, Myrrix.
It's definitely an example of precisely this two-layered architecture
we've been discussing on this thread. http://myrrix.com/design/
The nice thing about a
I would recommend you use (only) the ad data. These are boolean data
points in the recommender engine speak. You can 'recommend' ads this
way.
I understand your question is a bit more than that. First you want to
use the *not*-clicked data. My first question is, is this meaningful?
I am served
also cancel mahout subtasks.
Do you think it could work that way?
On 02/04/12 19:05, Sean Owen wrote:
You can use the Hadoop interface itself (like, the command-line hadoop
tool) to kill a job by its ID. If you kill one MapReduce job the
entire process should halt after that.
On Mon
Dear all -- I've long promised (threatened?) to begin efforts to
commercialize Apache Mahout. Given my line of work in VC, I see
evidence for positive symbiosis between open source and commercial
enterprise. We have evidence from the growth in user base and mailing
list, as well as the Mahout in
It's the same formula, what do you think is different?
On Wed, Apr 4, 2012 at 9:35 PM, ziad kamel ziad.kame...@gmail.com wrote:
Hi, I checked the code for NDCG and it seems not the same as
http://en.wikipedia.org/wiki/Discounted_cumulative_gain
How was that formula derived?
Thanks
The second thing is that it has a relevance number rel which the
formula doesn't use.
On Wed, Apr 4, 2012 at 3:54 PM, Sean Owen sro...@gmail.com wrote:
It's the same formula, what do you think is different?
On Wed, Apr 4, 2012 at 9:35 PM, ziad kamel ziad.kame...@gmail.com wrote:
Hi , I checked
No, re-read my last message. The ordering matters, since the discount
changes at each position.
On Wed, Apr 4, 2012 at 10:36 PM, ziad kamel ziad.kame...@gmail.com wrote:
It seems that having a recommended list that is for example
9, 23, 8
or
8 , 9 , 23
will give the same NDCG, since it just
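A small standalone sketch of the standard DCG/NDCG formula (illustrative code only, not Mahout's evaluator API) makes the order-dependence concrete: the discount grows with position, so the same relevance values in a different order score differently.

```java
// Minimal sketch of DCG/NDCG per the standard formula:
// DCG = sum over positions i (from 1) of rel_i / log2(i + 1).
// Class and method names here are illustrative, not Mahout's.
public class Ndcg {
    /** DCG of a ranked list of relevance values. */
    static double dcg(double[] rels) {
        double sum = 0.0;
        for (int i = 0; i < rels.length; i++) {
            sum += rels[i] / (Math.log(i + 2) / Math.log(2)); // log2(position + 1)
        }
        return sum;
    }

    /** NDCG: DCG divided by the DCG of the ideal (descending) ordering. */
    static double ndcg(double[] rels) {
        double[] ideal = rels.clone();
        java.util.Arrays.sort(ideal);
        // reverse to descending order
        for (int i = 0; i < ideal.length / 2; i++) {
            double t = ideal[i];
            ideal[i] = ideal[ideal.length - 1 - i];
            ideal[ideal.length - 1 - i] = t;
        }
        double idealDcg = dcg(ideal);
        return idealDcg == 0.0 ? 0.0 : dcg(rels) / idealDcg;
    }

    public static void main(String[] args) {
        // Same relevance values, different orders: the scores differ,
        // because the discount changes at each position.
        System.out.println(ndcg(new double[]{3.0, 2.0, 1.0})); // 1.0: already ideal
        System.out.println(ndcg(new double[]{1.0, 2.0, 3.0})); // < 1.0
    }
}
```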
Not with the Apache license... it's not copyleft. The GNU license
might require this.
On Wed, Apr 4, 2012 at 11:43 PM, Darren Govoni dar...@ontrenet.com wrote:
The short answer is that they have to open their source. So anything
they do to the original code is readily available to all.
This is lightly covered in Mahout in Action but yes there is really little
more to know. You upload the job jar and run it like anything else in AWS.
On Apr 3, 2012 10:24 AM, Sebastian Schelter s...@apache.org wrote:
None that I'm aware of. But it's super easy to use Mahout in EMR: You need
to
You can use the Hadoop interface itself (like, the command-line hadoop
tool) to kill a job by its ID. If you kill one MapReduce job the
entire process should halt after that.
On Mon, Apr 2, 2012 at 6:44 PM, Sören Brunk soren.br...@deri.org wrote:
Hi,
I'm using the distributed RecommenderJob
(Why not read the code first? We kinda reserve the mailing list for more
specific questions from after you've tried the basics.)
On Sun, Apr 1, 2012 at 3:13 PM, ziad kamel ziad.kame...@gmail.com wrote:
Does Mahout compute the similarity between every pair of users to
determine their
You don't want to do this. Similarity only makes sense if it's symmetric.
Instead, you probably want to weight at the point that the similarity is
used. Compute it normally, then weight depending on which item is what.
On Fri, Mar 30, 2012 at 8:02 AM, tianwild tianwild...@hotmail.com wrote:
Hi
L Shaw jls...@uw.edu wrote:
Suggestion, indeed. I passed that option, but still only 2 mappers
were
created.
On Thu, Mar 29, 2012 at 5:23 PM, Sean Owen sro...@gmail.com wrote:
Hadoop is what chooses the number of mappers, and it bases it on
input
size. Generally
I think the real cause is perhaps that the implementation is not fully
fleshed out. I haven't looked at it, but I'm sure that if you find
additions and improvements you could post them and get them committed.
I am probably missing something basic, but you seemed to say at the outset
that you
Nope it's the sum of the absolute values of differences in ratings, for
your purposes.
On Thu, Mar 29, 2012 at 7:29 PM, ziad kamel ziad.kame...@gmail.com wrote:
City block distance or Manhattan distance
Wikipedia define it for points as
http://en.wikipedia.org/wiki/Taxicab_geometry
So how
Like I think we've said, it depends on your data. I expect that some
similarity metrics will work better than others. Why is hard to say
without knowing anything about your data.
I don't understand your previous question about representation. I just gave
you the definition of city-block distance.
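To make the definition above concrete, here is a standalone sketch of city-block distance over two users' ratings: the sum of absolute differences over the items both have rated. This is illustrative code, not Mahout's CityBlockSimilarity class itself.

```java
import java.util.HashMap;
import java.util.Map;

// City-block (Manhattan) distance between two users' ratings:
// the sum of absolute differences over co-rated items.
// Illustrative sketch; not Mahout's actual implementation.
public class CityBlock {
    static double distance(Map<Long, Double> a, Map<Long, Double> b) {
        double sum = 0.0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {                    // only co-rated items contribute
                sum += Math.abs(e.getValue() - other);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<Long, Double> u1 = new HashMap<>();
        Map<Long, Double> u2 = new HashMap<>();
        u1.put(1L, 5.0); u1.put(2L, 3.0); u1.put(3L, 4.0);
        u2.put(1L, 2.0); u2.put(2L, 4.0);           // item 3 unrated by u2
        System.out.println(distance(u1, u2));       // |5-2| + |3-4| = 4.0
    }
}
```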
Hadoop is what chooses the number of mappers, and it bases it on input
size. Generally it will not assign less than one worker per chunk and a
chunk is usually 64MB (still, I believe). You can override this directly
(well, at least, register a suggestion to Hadoop). I would tell you the
exact flag
What top items? I am not sure what you're referring to here, but, no I do
not expect things to be identical when changing metrics in general. I've
already answered your other question.
On Thu, Mar 29, 2012 at 10:52 PM, ziad kamel ziad.kame...@gmail.com wrote:
OK, things are becoming clearer.
(If you're using a modern version of Hadoop, the flag is something
different, so make sure you check what the real value is.)
There's another option concerning minimum split size that you could reduce
from its default too.
On Thu, Mar 29, 2012 at 11:05 PM, Jason L Shaw jls...@uw.edu wrote:
There is not necessarily a relation, but, a good recommender ought to be
good at predicting ratings, and ought to return good recommendations. So
yes you would generally expect a low error when you get a high precision,
but there is not a direct connection.
On Wed, Mar 28, 2012 at 5:52 PM, ziad
It pretends that any non-existent preference actually exists and is equal
to the user's average preference. It is only done for purposes of computing
similarity. It does not actually set a value in the model.
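A tiny sketch of that inference step (illustrative code; in Mahout this is, as I understand it, the job of AveragingPreferenceInferrer, used only during similarity computation):

```java
import java.util.HashMap;
import java.util.Map;

// When a similarity computation needs a preference the user never
// expressed, the inferrer substitutes the user's mean rating.
// The underlying model is never modified. Illustrative sketch only.
public class InferredPrefs {
    static double mean(Map<Long, Double> prefs) {
        double sum = 0.0;
        for (double v : prefs.values()) sum += v;
        return sum / prefs.size();
    }

    /** Value used during similarity: real preference if present, else the user's mean. */
    static double effectivePref(Map<Long, Double> prefs, long itemID) {
        Double v = prefs.get(itemID);
        return v != null ? v : mean(prefs);
    }

    public static void main(String[] args) {
        Map<Long, Double> user = new HashMap<>();
        user.put(1L, 4.0);
        user.put(2L, 2.0);
        // Item 3 was never rated; for similarity purposes it counts as the mean, 3.0.
        System.out.println(effectivePref(user, 3L)); // 3.0
        // The data model still has no entry for item 3.
        System.out.println(user.containsKey(3L));    // false
    }
}
```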
On Wed, Mar 28, 2012 at 10:45 PM, ziad kamel ziad.kame...@gmail.com wrote:
Hi ,
It
Make sure you use the latest MySQL driver. Are you sure that one of
your columns is not NULL?
On Tue, Mar 27, 2012 at 6:42 AM, 344911009 mudaom...@vip.qq.com wrote:
windows XP 2G
running MovieSite, using mysql-connector-java-5.1.13-bin.jar; userID and
movieID are INTEGER, but there are still errors
Any similarity metric works to the extent that its assumptions match
the data's reality. Pearson's key assumption is that ratings scale
proportionally with our degree of like or dislike for a thing. That is
only sort of how people rate things. Really a 1 or 2 (on a scale of 5)
means I am sort of
I'm sure he's referring to the off-line model-building bit, not an online
component.
On Mon, Mar 26, 2012 at 9:27 AM, Razon, Oren oren.ra...@intel.com wrote:
By saying: At Veoh, we built our models from several billion interactions
on a tiny cluster you meant that you used the distributed
necessarily need
to load the entire intermediate file (similarity results) into the memory?!
-Original Message-
From: Sean Owen [mailto:sro...@gmail.com]
Sent: Monday, March 26, 2012 11:48
To: user@mahout.apache.org
Subject: Re: Mahout beginner questions...
I'm sure he's referring
An SQL database doesn't have much role to play in this kind of system,
and that's no criticism of RDBMSes.
The algorithms operate on very simple, nearly unstructured data and
are essentially read-only. So the complexity of keys and transactions
is just overhead. The simple, non-distributed
Can it be implemented? Sure, but what you see is what is available. If
you want a different clustering approach you would have to implement
it. The algorithm there is not k-means.
On Mon, Mar 26, 2012 at 8:49 PM, Ahmed Abdeen Hamed
ahmed.elma...@gmail.com wrote:
Hello,
This might sound trivial
This is no useful detail at all. What algorithm are you even running??
On Mon, Mar 26, 2012 at 11:29 PM, ziad kamel ziad.kame...@gmail.com wrote:
Dear developers ,
I ran some recommendations on Mahout on 32- and 64-bit machines (Ubuntu). I
found out that on 32-bit I am getting higher
Au contraire, you can do exactly this with an IDRescorer. Divide by (the
log of) an item's occurrences, for example, to penalize popular items.
I don't recommend this. Stuff like the log-likelihood metric is already in
a sense accounting for things that are just generally popular and
normalizing
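The divide-by-log idea can be sketched standalone. In Mahout this logic would live inside an IDRescorer passed to the recommender; the class below just shows the math, with a hypothetical occurrence-count map.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of popularity-penalizing rescoring: divide each candidate's
// score by log(1 + occurrence count) so very popular items sink.
// Standalone illustration; not Mahout's IDRescorer interface itself.
public class PopularityPenalty {
    private final Map<Long, Integer> occurrences;

    PopularityPenalty(Map<Long, Integer> occurrences) {
        this.occurrences = occurrences;
    }

    /** Rescore: original score divided by log(1 + count). */
    double rescore(long itemID, double originalScore) {
        int count = occurrences.getOrDefault(itemID, 1);
        return originalScore / Math.log(1.0 + count);
    }

    public static void main(String[] args) {
        Map<Long, Integer> counts = new HashMap<>();
        counts.put(100L, 1000); // very popular item
        counts.put(200L, 10);   // niche item
        PopularityPenalty p = new PopularityPenalty(counts);
        // Same raw score: the niche item now outranks the popular one.
        System.out.println(p.rescore(100L, 4.0));
        System.out.println(p.rescore(200L, 4.0));
    }
}
```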
but a good way to boost up speed could be to use a
caching recommender, meaning computing the recommendations in advance
(refresh it every X min\hours) and always recommend using the most updated
recommendations, right?!
-Original Message-
From: Sean Owen [mailto:sro...@gmail.com]
Sent: Sunday
me if I'm wrong but a good way to boost up speed could be to use a
caching recommender, meaning computing the recommendations in advance
(refresh it every X min\hours) and always recommend using the most updated
recommendations, right?!
-Original Message-
From: Sean Owen
Why are you posting to Mahout lists, 3 times, if you are asking about
Hadoop? Etiquette foul.
On Mar 24, 2012 10:41 AM, Bahadır Yılmaz bahadiryi...@gmail.com wrote:
Hi everyone,
I have a problem with HadoopUtil.overwriteOutput(outPath). In IntelliJ
IDEA, I am using a Maven project and
Define significant?
On Sat, Mar 24, 2012 at 1:38 PM, ziad kamel ziad.kame...@gmail.com wrote:
Dear developers,
How can I know that the recommendations I get from Mahout is significant ?
Is there a way to know that there is serendipity in recommending using
certain recommender than other ?
1. These are the JDBC-related classes. For example see
MySQLJDBCDiffStorage or MySQLJDBCDataModel in integration/
2. The distributed and non-distributed code are quite separate. At
this scale I don't think you can use the non-distributed code to a
meaningful degree. For example you could
will be to use model-based recommenders.
Saying this, I wonder why there are so few model-based recommenders,
especially considering the fact that Mahout contains several data mining
models implemented already?
-Original Message-
From: Sean Owen [mailto:sro...@gmail.com]
Sent: Thursday
That pretty much means what it says = delete temp.
On Thu, Mar 22, 2012 at 6:06 PM, jeanbabyxu jessica...@aexp.com wrote:
Thanks so much tianwild for pointing out the typo. Now it's running but I got
a different error msg:
Exception in thread main
Yes. This prevents accidental overwrite, and mimics how Hadoop/HDFS
generally act.
On Thu, Mar 22, 2012 at 6:58 PM, jeanbabyxu jessica...@aexp.com wrote:
I was able to manually clear out the output directory by using
bin/hadoop dfs -rmr output.
But do we have to remove all content in the
It is wherever you compiled your own classes -- it's up to you.
SIMILARITY_EUCLEDEAN_DISTANCE is not a class.
You should use 0.6 anyway. While you may find you have to make minor
modifications if following the book, it's 99% compatible.
On Thu, Mar 22, 2012 at 8:07 PM, jeanbabyxu
What do you mean that you have a user-item association from a
log-likelihood metric?
Combining two values is easy in the sense that you can average them or
something, but only if they are in the same units. Log likelihood
may be viewed as a probability. The distance function you derive from
it --
now.
Thanks very much,
-Ahmed
On Thu, Mar 22, 2012 at 5:26 PM, Sean Owen sro...@gmail.com wrote:
What do you mean that you have a user-item association from a
log-likelihood metric?
Combining two values is easy in the sense that you can average them or
something, but only
Yes, but you can't use it as both things at once. I meant that you
swap them at the broadest level -- at your original input. So all
items are really users and vice versa. At the least you need two
separate implementations, encapsulating two different notions of
similarity.
Similarity is
It's -Dmapred.output.dir=output not --Dmapred.output.dir=output (one dash),
but, that's not even the problem.
I don't think you can specify -D options this way, as they are JVM
arguments. You need to configure these in Hadoop's config files.
This is not specific to Mahout.
On Wed, Mar 21, 2012 at
If you don't need Hadoop then this is pretty simple. You can just write a
nested loop that computes all pairs off an ItemSimilarity implementation.
If I recall rightly GenericItemSimilarity will do that for you off an
existing ItemSimilarity and then has the results in memory as a new
No there is not such support right now.
The most useful piece of code would be a DataModel implementation that
combines the data in several other DataModels. That would easily let you
read from several databases.
The hard part there is merging data sets (what if two DBs have data for one
No I don't think that really comes into play in any of the ML algorithms
here. At least I do not recall seeing it.
On Mon, Mar 19, 2012 at 3:44 PM, Ahmed Abdeen Hamed ahmed.elma...@gmail.com
wrote:
Hello,
Does Mahout have support for Edit Distance between two Strings? I looked on
the web
Yep it's all in memory -- it would be too slow to access it out of Mongo.
The purpose is just making it easy to read and re-read data into Mongo, and
facilitate updates.
If the data is too big to fit in memory you should look first at pruning
your data -- can sampling 10% of it still give you
What do you mean by indexed here?
On Sat, Mar 17, 2012 at 10:56 PM, Pat Ferrel p...@occamsmachete.com wrote:
I need to digest some mahout files and merge them into a MongoDB database.
Since digesting would be a lot easier if the mahout keys were indexed I
wonder if a seqdumper --format json
You shouldn't have to add anything to your jar, if you use the
supplied 'job' file which contains all transitive dependencies.
If you do add your own jars, I think you need to unpack and repack
them, not put them into the overall jar as a jar file, even with a
MANIFEST.MF entry. I am not sure that
that only the clustering and classification parts of mahout are really
able to be distributed on a hadoop cluster.
2012/3/15 Sean Owen sro...@gmail.com
You shouldn't have to add anything to your jar, if you use the
supplied 'job' file which contains all transitive dependencies.
If you do add
is: is there a way to compute these similarities offline?
Thanks very much,
-Ahmed
On Tue, Mar 6, 2012 at 5:14 PM, Sean Owen sro...@gmail.com wrote:
Sure, you just write your own ItemSimilarity implementation based on
the content, whatever that may be. What you do there is mostly up to
you; there's
Before I answer, I want to make sure we're on the same page. You are
definitely describing a search problem. Was my guess at how you are
also adding in something recommender-related accurate?
Otherwise we may be talking past each other again.
On Tue, Mar 13, 2012 at 5:35 PM, Ahmed Abdeen Hamed
OK, you have some users. You have some items, and those items have attributes.
Nothing here connects users to items though, so how can any process
estimate any additional user-item connections?
You could compute item-item similarities, but that doesn't resolve this.
Sorry I am really confused
genre,
director,
actor, and year of release. Using such an implementation within a
traditional item
This is the part that I am trying to understand and have a solution for.
Thanks,
-Ahmed
On Tue, Mar 13, 2012 at 2:08 PM, Sean Owen sro...@gmail.com wrote:
OK, you have some users
Yes it's item-based only. --similarityClassname chooses the metric but
it is item-based.
On Tue, Mar 13, 2012 at 11:53 PM, Rich cchuang...@gmail.com wrote:
Hi,
I have been digging into Mahout on Hadoop for the past few days.
I was wondering the recommendation
algorithm that is used in
You can implement your own custom ItemSimilarity that computes this
metric, or anything else you can imagine. In fact there is already a
bit of API in DataModel for storing and retrieving timestamps too, so
this should be easy.
It's probably a bit easier said than done given the exact logic
Sure -- to do this, you simply flip your items and users. Feed item
IDs as user IDs and vice versa. Then you have a system that recommends
users to items, really. And you can use clustering if you like, to do
that. In fact you can use any algorithm.
Sean
On Mon, Mar 12, 2012 at 1:56 PM, Ahmed
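The flip amounts to swapping the first two columns of the usual "user,item[,pref]" input so items play the role of users; everything downstream stays the same. A minimal sketch (hypothetical input lines, not a Mahout utility):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Swap the user and item columns of "user,item[,pref]" lines so that
// a standard recommender ends up recommending users to items.
// Illustrative sketch only.
public class FlipInput {
    static String flip(String line) {
        String[] f = line.split(",");
        return f[1] + "," + f[0] + (f.length > 2 ? "," + f[2] : "");
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1,101,5.0", "2,101,3.0", "2,102,4.0");
        List<String> flipped =
                input.stream().map(FlipInput::flip).collect(Collectors.toList());
        System.out.println(flipped); // items now occupy the user column
    }
}
```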
Similarity computations need to be very fast. I don't know if you can
pre-compute them since they're time-dependent and I assume need to use
up-to-the-second information.
You'll need to store something in memory to make this fast enough.
That can make scale a problem, but, I am also guessing you
OK if that's the case, put the pre-computed values in a
GenericItemSimilarity and you're done.
Hadoop most certainly does not help you compute anything 'on the fly'.
It might help you precompute. Don't worry about distribution until
you're sure you have a big scale problem, and that usually takes
(It's out there as TanimotoCoefficientSimilarity -- not named
JaccardSimilarity or anything.)
On Mon, Mar 12, 2012 at 10:59 PM, Ted Dunning ted.dunn...@gmail.com wrote:
I would generally recommend using the LLR similarity.
But if you have an itch, scratch it. I do think we have a tanimoto
This isn't a recommender problem -- it's simpler. It sounds like you
just want to count the most frequently occurring items, and pairs of
items. That's just a question of counting.
On Sun, Mar 11, 2012 at 12:32 PM, mahout user mahoutu...@gmail.com wrote:
Hello group,
I am new to mahout..I am
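The counting Sean describes needs no recommender machinery at all. A plain-Java sketch with hypothetical basket data:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Tally single items and co-occurring pairs across transactions,
// then read off the most frequent. Illustrative sketch only.
public class CoCounts {
    /** How often each item appears across all baskets. */
    static Map<String, Integer> itemCounts(List<List<String>> baskets) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> basket : baskets) {
            for (String item : basket) {
                counts.merge(item, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Pair co-occurrence counts; the key lists the two items in sorted order. */
    static Map<String, Integer> pairCounts(List<List<String>> baskets) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> basket : baskets) {
            for (int i = 0; i < basket.size(); i++) {
                for (int j = i + 1; j < basket.size(); j++) {
                    String a = basket.get(i), b = basket.get(j);
                    // canonical key so (a,b) and (b,a) count together
                    String key = a.compareTo(b) < 0 ? a + "+" + b : b + "+" + a;
                    counts.merge(key, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> baskets = Arrays.asList(
                Arrays.asList("milk", "bread"),
                Arrays.asList("milk", "eggs"),
                Arrays.asList("milk", "bread", "eggs"));
        System.out.println(itemCounts(baskets).get("milk"));       // 3
        System.out.println(pairCounts(baskets).get("bread+milk")); // 2
    }
}
```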
No, it's so easy you can do it in about 20 lines of code so I don't
think it really warrants a software component.
On Sun, Mar 11, 2012 at 12:39 PM, mahout user mahoutu...@gmail.com wrote:
Thanks Sean Owen,
is it any class available with mahout for doing this stuff?
If by #3 you mean you have preferences for many users, this is of
course the standard input for a recommender, yes. If you also have
some user-user similarity info beyond that, you can implement
UserSimliarity and use GenericUserBasedRecommender to incorporate
that.
If you want to boost items
, 2012 at 6:25 PM, Sean Owen sro...@gmail.com wrote:
If by #3 you mean you have preferences for many users, this is of
course the standard input for a recommender, yes. If you also have
some user-user similarity info beyond that, you can implement
UserSimliarity and use GenericUserBasedRecommender
Recommender implementation which
blends both item-based and user-based recommendations?
On Sat, Mar 10, 2012 at 9:06 PM, Sean Owen sro...@gmail.com wrote:
It really depends on what you mean by based on time, as it could
mean many things. I'm assuming you mean that an item's seasonality
should
, 2012 at 9:38 PM, Sean Owen sro...@gmail.com wrote:
It sounds like you have substantially a search problem. You know the
user's attributes, you know the items' attributes, and are just
finding the closest match. That by itself doesn't need a recommender
at all; it would just be extra complexity
This means you are running on a headless machine without a monitor. The
program needs to show a window with graphics but can't.
On Mar 9, 2012 6:48 AM, rahul raghavendhra rahulraghavendh...@gmail.com
wrote:
hi Lance,
I tried as you said, but now I got a new exception:
Exception in thread main
In this case, the code in question is the non-distributed code rather
than Hadoop. But yes I agree it will make a perhaps bigger difference
on Hadoop. All of the Hadoop stuff uses integer keys.
On Fri, Mar 9, 2012 at 2:10 AM, Paritosh Ranjan pran...@xebia.com wrote:
Are these identifiers used as
I don't expect they are different in speed. Both do about exactly the
same thing and finish with a simple computation.
On Thu, Mar 8, 2012 at 9:52 AM, Ayad Al-Qershi alqer...@gmail.com wrote:
Dear All,
can anyone tell me why running the recommender job with log-likelihood
similarity performs
No. It used to work this way, but was removed just because you get
much better memory and performance using longs. It would be a lot of
surgery to undo this.
The best answer is to use longs. If you must use strings, IDMigrator
does the trick quite well.
On Thu, Mar 8, 2012 at 1:27 PM, Claudia
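The idea behind IDMigrator is to map an arbitrary string ID to a long deterministically via a hash (Mahout's version is MD5-based, as I recall, keeping the first 8 bytes), so the recommender works with longs internally while you keep a side table from long back to string. A standalone sketch of that hashing step, not Mahout's actual class:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Deterministically hash a string ID to a long by folding the first
// 8 bytes of its MD5 digest. Illustrative sketch of the IDMigrator idea.
public class StringIdToLong {
    static long toLongID(String stringID) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] hash = md5.digest(stringID.getBytes(StandardCharsets.UTF_8));
            long result = 0L;
            for (int i = 0; i < 8; i++) {          // fold first 8 bytes into a long
                result = (result << 8) | (hash[i] & 0xFF);
            }
            return result;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);    // MD5 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(toLongID("user-alice") == toLongID("user-alice")); // deterministic
        System.out.println(toLongID("user-alice") == toLongID("user-bob"));   // collision is vanishingly unlikely
    }
}
```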
Yes this doesn't exist as a push-button solution anymore. There is no
target that builds a .war. However it's pretty easy to resurrect the
script from 0.5, or, simply configure your IDE to build a .war with
the Mahout .jar, your .jar, and a one-liner web.xml that configures
RecommenderServlet.
The client can override cluster defaults unless the cluster marks them final.
On Wed, Mar 7, 2012 at 9:02 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
Don't the hadoop site.xml settings on the driver's client usually
overshadow whatever is on the cluster? Or you don't have the privs
to change
DistributedRowMatrix operates on IntWritable,VectorWritable in a
sequence file, and it looks like you're feeding text. No, it doesn't
accept some text-based format.
On Wed, Mar 7, 2012 at 8:41 PM, PEDRO MANUEL JIMENEZ RODRIGUEZ
pmjimenez1...@hotmail.com wrote:
Sorry but I can't understand how
RecommenderService.jws is a JWS file, which is one standard for making
SOAP-based web services. RecommenderServlet is a 'raw' servlet
wrapper. Both are just wrappers around a Recommender that expose it
over HTTP. Neither is quite REST-ful; both are JavaEE, yes.
You can do anything you want here
Your input is still text though, and I assume you're trying to use
TextInputFormat. You can't do this as it expects an IntWritable, and
that means it expects input as a sequence file, via
SequenceFileInputFormat.
On Tue, Mar 6, 2012 at 7:21 PM, PEDRO MANUEL JIMENEZ RODRIGUEZ
Mapper compression? -Dmapreduce.map.output.compress=false. I think the
key was mapred.output.compress in Hadoop 0.20.0.
I am not sure if there is reducer compression built-in, but, I could
have missed it.
On Tue, Mar 6, 2012 at 9:40 PM, Luke Forehand
luke.foreh...@networkedinsights.com wrote:
Sure, you just write your own ItemSimilarity implementation based on
the content, whatever that may be. What you do there is mostly up to
you; there's not a framework for this.
On Tue, Mar 6, 2012 at 10:09 PM, Ahmed Abdeen Hamed
ahmed.elma...@gmail.com wrote:
Hello friends,
Is there an example
which is
why I'm trying to override this param). Passing -Dkey=value on the mahout
command line does not seem to have any effect on the mapreduce job
configuration from what I can tell. Any ideas?
-Luke
On 3/6/12 3:48 PM, Sean Owen sro...@gmail.com wrote:
Mapper compression
and in the longterm probably come up
with a cleaner way to do this. Thanks!
-Luke
On 3/6/12 6:24 PM, Sean Owen sro...@gmail.com wrote:
-D arguments are to the JVM so need to be set in HADOOP_OPTS (as I
recall). Or you configure this in your Hadoop config files. It has no
meaning to the driver script. Why
in the first place?
Here is the header of one of the reducer parts that was written into
/mahout/kmeans/clusters-5-final
SEQ org.apache.hadoop.io.Text+org.apache.mahout.clustering.kmeans.Cluster
)org.apache.hadoop.io.compress.SnappyCodec
On 3/6/12 6:33 PM, Sean Owen sro...@gmail.com wrote
I answered on SO:
The only thing I can think of that sounds like this problem is
PageRank. It's computed by a sort of iterative simulation. Each page
has some influence (color) which flows via its links (socks it's washed
with) and at some point the page influence reaches a steady state
(final
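That iterative simulation can be sketched on a tiny hand-made graph (illustration of the idea, not Mahout code): each round, every page's influence flows along its outgoing links with the usual 0.85 damping factor, until the scores settle.

```java
// Power-iteration sketch of PageRank on a tiny graph.
// links[i] lists the pages that page i links to.
public class TinyPageRank {
    static double[] ranks(int[][] links, int iterations) {
        int n = links.length;
        double d = 0.85;                               // damping factor
        double[] rank = new double[n];
        java.util.Arrays.fill(rank, 1.0 / n);
        for (int iter = 0; iter < iterations; iter++) {
            double[] next = new double[n];
            java.util.Arrays.fill(next, (1.0 - d) / n); // teleport term
            for (int i = 0; i < n; i++) {
                for (int j : links[i]) {
                    next[j] += d * rank[i] / links[i].length; // influence flows along links
                }
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // 0 -> 1,2 ; 1 -> 2 ; 2 -> 0
        double[] r = ranks(new int[][]{{1, 2}, {2}, {0}}, 100);
        System.out.println(java.util.Arrays.toString(r)); // steady-state scores; they sum to 1
    }
}
```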
CassandraDataModel is not related to HMM. Maybe you could be more specific
here.
On Feb 29, 2012 4:43 AM, Srinivas Krishnan shrin.krish...@gmail.com
wrote:
I am currently designing my Data Model for a small cassandra cluster and
wanted to incorporate the HMM model from Mahout. I could not find
Data model, which I am
guessing is support for mapping on to Columns, SuperColumns etc or am I
mistaken ?
-srinivas
On Wed, Feb 29, 2012 at 9:23 AM, Sean Owen sro...@gmail.com wrote:
CassandraDataModel is not related to HMM. Maybe you could be more
specific
here.
On Feb 29, 2012 4:43
to Hadoop.
-srinivas
On Wed, Feb 29, 2012 at 10:30 AM, Sean Owen sro...@gmail.com wrote:
That is for non distributed recomenders, not using Hadoop. For anything
else using Hadoop you use Cassandra by using it as an input to Hadoop. It
is not specific to Mahout.
On Feb 29, 2012 3:23 PM
Caused by: java.lang.IllegalArgumentException: Bad line: 444,25414
This is your problem.
On Wed, Feb 29, 2012 at 12:21 PM, VIGNESH PRAJAPATI
vignesh2...@gmail.comwrote:
Hello Mahout Group,
When I am going to run my ItemBased Recommender on the below given Dataset
structure, it gives me this
Your job file is corrupt or missing. Verify it's there and try rebuilding.
On Feb 28, 2012 7:54 AM, manish dunani manishd...@gmail.com wrote:
I am a newbie to Mahout.
Can anybody help me solve the following error?
Whenever I try to run RecommenderJob over Apache Hadoop I got the
this I
didn't get any idea.
sean owen:
Your job file is corrupt or missing. Verify it's there and try rebuilding.
I am a newbie to Mahout.
Can anybody help me solve the following error?
Whenever I try to run RecommenderJob over Apache Hadoop I got the
following error:(Reference
Oh it's very easy:
tr ';' ',' < in.csv > out.csv
Or something close.
On Feb 28, 2012 7:31 PM, VIGNESH PRAJAPATI vignesh2...@gmail.com wrote:
Hello Daniel Glauser ,
Thanks for your suggestion, but I have 200,000 rows in my CSV file, so
it requires great modification. For a solution, I want
Definitely a typo in the second passage. I'll fix it when I get home unless
someone beats me to it.
On Feb 27, 2012 3:35 PM, Don Smith dsm...@likewise.com wrote:
The documentation for GenericUserPreferenceArray says Like {@link
GenericItemPreferenceArray} but stores preferences for one user (all
I think this thread is talking about at least 4 different things.
1. There is no HBaseDataModel for non-distributed code, that uses
the HBase driver presumably, but could be like there is
CassandraDataModel. That's what I was talking about.
2. You could use a JDBC driver for HBase with
No, it's a library that you run where you like. There's no hosting for
it per se but yeah you could run on Amazon.
On Thu, Feb 16, 2012 at 8:30 AM, VIGNESH PRAJAPATI
vignesh2...@gmail.com wrote:
Hi Folks,
I am new to Mahout. I want to know whether there is any Mahout hosting
provider for Apache
Hmm. I updated it in SVN and thought our fancy new svnpubsub system
was supposed to push that for us.
I'll ask if there's something else we need to do.
On Thu, Feb 16, 2012 at 5:17 PM, Suneel Marthi suneel_mar...@yahoo.com wrote:
Could someone update the Mahout wiki - http://mahout.apache.org