Thanks all, just subscribed to the mentors list.
Regards,
Tommaso
2014-03-21 10:23 GMT+01:00 Michael McCandless luc...@mikemccandless.com:
ACK from Lucene PMC.
I'm also CC'ing ment...@community.apache.org (Tommaso, you should
subscribe if you haven't already).
Thanks Tommaso! Sad to have
You should also subscribe to code-awards@a.o.
See http://community.apache.org/gsoc.html for details ...
Thanks for being a mentor! We have far too few mentors in Lucene/Solr
unfortunately.
Mike McCandless
http://blog.mikemccandless.com
On Fri, Mar 21, 2014 at 6:23 AM, Tommaso Teofili
2014-03-21 11:35 GMT+01:00 Michael McCandless luc...@mikemccandless.com:
You should also subscribe to code-awards@a.o.
strangely this resulted in qmail-send program replying:
code-awards-subscr...@apache.org:
This mailing list has moved to mentors at community.apache.org.
so I guess
Ahh... the list must have moved. Good to know :)
Mike McCandless
http://blog.mikemccandless.com
On Fri, Mar 21, 2014 at 7:04 AM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:
2014-03-21 11:35 GMT+01:00 Michael McCandless luc...@mikemccandless.com:
You should also subscribe to
Hi Ivan,
It's best to just add a comment onto LUCENE-466 with your
ideas/questions specific to that issue; other more general questions
should be sent to this dev list.
Since the big part of that issue (supporting minShouldMatch in
BooleanQuery) was already done, I think fixing query parsers to
First, thanks so much for getting me pointed in the right direction! I
assume you mean straight on Jira? Also do you have any clue where one would
be able to find past proposals for Lucene?
Thanks,
Ivan
On Wed, Mar 12, 2014 at 12:08 PM, Michael McCandless
luc...@mikemccandless.com wrote:
Hi
Sorry, yes, please add comments/ideas straight on the Jira issue, i.e.
https://issues.apache.org/jira/browse/LUCENE-466 in this case.
Hmm, I'm not sure how to find past proposals. The links to these
proposals, e.g. from my past blog post, and from past Jira issues,
seem to be broken now.
Mike
I think a good place to start is on the issue itself.
E.g. add a comment expressing that you're interested in this issue,
maybe summarize roughly what's entailed. E.g., that issue is quite
old, and the first part of it (supporting minShouldMatch in BQ) has
already been done, so all that remains
Thanks Adrien!
Mike McCandless
http://blog.mikemccandless.com
On Fri, Mar 29, 2013 at 1:49 PM, Adrien Grand jpou...@gmail.com wrote:
Hi,
Although I probably won't be able to mentor students next summer, I
think it would be great to have students this year too. I modified
open JIRA issues
Hello Raimon,
depending on what the focus of your master thesis should be, Lucene / Solr may or
may not be the right project.
Basically if your sentiment analysis topic is tied to information
retrieval (very dummy example: making a search engine which scores
documents boosting positive ones) then it could
Hi Tommaso,
Yes, I agree. To use Lucene in this kind of project we would need to focus
on creating sentiment ranking or improve the text classification
capabilities of Lucene. Integration with other might be interesting, also.
Thanks,
Raimon Bosch.
2013/3/20 Tommaso Teofili
Anyone interested?
2013/3/18 Raimon Bosch raimon.bo...@gmail.com
Hi all,
I would be interested in doing a Google Summer of Code this year with
Lucene or Solr. My master thesis topic is about sentiment analysis. Is there
any research in this direction inside Solr and Lucene? If there is any
Since your test uses PerFieldPostingsFormat, it's going to write the
name of your format PForDelta into the index and expects to be able
to load it via the SPI mechanism.
So I think you should register your PForDeltaPostingsFormat in
Ah, I see. Thank you Robert !
On Tue, May 1, 2012 at 2:46 AM, Robert Muir rcm...@gmail.com wrote:
Since your test uses PerFieldPostingsFormat, it's going to write the
name of your format PForDelta into the index and expects to be able
to load it via the SPI mechanism.
So I think you should
Hi,
here's my first suggestion for the Refactoring steps:
The IW class is by now very big, and I would try to reduce the code
by delegating special functions to the new components (pattern: SRP).
The IndexWriter would keep most of its APIs and only delegate.
I would try to extract the internals
Hey Simon,
thx for your fast response!
to begin with make sure you read this:
http://wiki.apache.org/lucene-java/SummerOfCode2012
http://wiki.apache.org/lucene-java/HowToContribute
Okay, I read the documentation.
Yeah we have multiple tests for IndexWriter (IW in short) they are all
Hey Tim, great to have you!
to begin with make sure you read this:
http://wiki.apache.org/lucene-java/SummerOfCode2012
On Wed, Apr 4, 2012 at 12:20 AM, Achmetow (Google)
achmeto...@googlemail.com wrote:
Hi,
I am a student from Germany and would like to contribute to the ASF Lucene
project.
On Mon, Mar 26, 2012 at 6:59 PM, Han Jiang jiangha...@gmail.com wrote:
Hi all,
I was trying to figure out the control flow of IndexWriter and
IndexSearcher, in order to get a better understanding of the idea behind
Codec implementation.
However, there seem to be some questions related with
Hello,
One quick question up front: are you subscribed to the dev list? If
not, you may have missed my response to your last email with GSoC
questions:
http://lucene.markmail.org/thread/lqv6lyql2nlagv7f#query:+page:1+mid:ubjsvvfviuaexqlo+state:results
Answers below:
On Fri, Mar 23,
Hello! Answers below...:
On Wed, Mar 21, 2012 at 11:03 AM, Han Jiang jiangha...@gmail.com wrote:
Hi All,
I'm Billy, a senior undergraduate student in Peking University. I'm working
in the area of Information Retrieval and Web Mining. When going through the
idea list, I felt quite interested
Mark, can you open an issue for this and label it as:
gsoc2012
lucene-gsoc-12
mentor
just like this one https://issues.apache.org/jira/browse/LUCENE-2562
thanks,
simon
On Fri, Mar 2, 2012 at 12:26 PM, mark harwood markharw...@yahoo.co.uk wrote:
Does anyone have any ideas?
A framework for
On Fri, Mar 2, 2012 at 11:30 AM, Robert Muir rcm...@gmail.com wrote:
Hello,
I was asked by a student if we are participating in GSOC this year. I
hope the answer is yes?
If we are planning to, I think it would be good if we came up with a
list on the wiki of potential tasks. Does anyone
Does anyone have any ideas?
A framework for match metadata?
Similar to the way tokenization was changed to allow tokenizers to enrich a
stream of tokens with arbitrary attributes, Scorers could provide
MatchAttributes to provide arbitrary metadata about the stream of matches
they produce.
I created an initial GSOC 2012 page here:
http://wiki.apache.org/lucene-java/SummerOfCode2012
simon
On Fri, Mar 2, 2012 at 12:26 PM, mark harwood markharw...@yahoo.co.uk wrote:
Does anyone have any ideas?
A framework for match metadata?
Similar to the way tokenization was changed to allow
Thanks for helping to get this started Simon and Mark!
On Fri, Mar 2, 2012 at 7:10 AM, Simon Willnauer
simon.willna...@googlemail.com wrote:
I created an initial GSOC 2012 page here:
http://wiki.apache.org/lucene-java/SummerOfCode2012
simon
On Fri, Mar 2, 2012 at 12:26 PM, mark harwood
2011/5/12 Michael McCandless luc...@mikemccandless.com
2011/5/9 Nikola Tanković nikola.tanko...@gmail.com:
Introduction of a FieldType class that will hold all the extra
properties now stored inside the Field instance other than the field value itself.
Seems like this is an easy first
2011/4/13 Nikola Tanković nikola.tanko...@gmail.com:
Hi all,
if everything goes well I'll be delighted to be part of this project this
summer together with my assigned mentor Mike. My task will be to introduce
new classes to Lucene core which will make it possible to separate Fields' Lucene
properties
Done!
--- On Wed, 6/4/11, Adriano Crestani adrianocrest...@apache.org wrote:
From: Adriano Crestani adrianocrest...@apache.org
Subject: GSoC Lucene proposals
To: dev@lucene.apache.org
Date: Wednesday, 6 April 2011, 22:43
Hi students,
We are receiving very good proposals this year, I
Hi Phillipe,
You could start taking a look at these projects:
LUCENE-2979 https://issues.apache.org/jira/browse/LUCENE-2979
LUCENE-2309 https://issues.apache.org/jira/browse/LUCENE-2309
Hey Simon and all,
May we get an update on this? I understand that Google has published the list
of accepted organizations, which -- not surprisingly -- includes the ASF. Is
there any information on how many slots Apache got, and which issues will be
selected?
The student application period
On Wed, Mar 23, 2011 at 9:37 AM, David Nemeskey
nemeskey.da...@sztaki.hu wrote:
Hey Simon and all,
May we get an update on this? I understand that Google has published the list
of accepted organizations, which -- not surprisingly -- includes the ASF. Is
there any information on how many slots
Ok, I have created a new issue, LUCENE-2959 for this project. I have uploaded
the pdfs and added the gsoc2011 and lucene-gsoc-2011 labels as well.
David
On 2011 March 09, Wednesday 21:58:53 Simon Willnauer wrote:
On Wed, Mar 9, 2011 at 5:48 PM, Grant Ingersoll gsing...@apache.org wrote:
I
awesome thanks!
simon
On Thu, Mar 10, 2011 at 11:54 AM, David Nemeskey
nemeskey.da...@sztaki.hu wrote:
Ok, I have created a new issue, LUCENE-2959 for this project. I have uploaded
the pdfs and added the gsoc2011 and lucene-gsoc-2011 labels as well.
David
On 2011 March 09, Wednesday
On Wed, Mar 9, 2011 at 3:58 PM, Simon Willnauer
simon.willna...@googlemail.com wrote:
On Wed, Mar 9, 2011 at 5:48 PM, Grant Ingersoll gsing...@apache.org wrote:
I think we, Lucene committers, need to identify who is willing to mentor.
In my experience, it is less than 5 hours a week. Most
I think we, Lucene committers, need to identify who is willing to mentor. In
my experience, it is less than 5 hours a week. Most of the work is done as
part of the community. Sometimes you have to be tough and fail someone (I did
last year) but most of the time, if you take the time to
On Wed, Mar 9, 2011 at 5:48 PM, Grant Ingersoll gsing...@apache.org wrote:
I think we, Lucene committers, need to identify who is willing to mentor.
In my experience, it is less than 5 hours a week. Most of the work is done
as part of the community. Sometimes you have to be tough and
Hey David and all others who want to contribute to GSoC,
the ASF has applied for GSoC 2011 as a mentoring organization. As a
ASF project we don't need to apply directly though but we need to
register our ideas now. This works like almost anything in the ASF
through JIRA. All ideas should be
I think that is good for now. I should get started on codeawards and
wrap up our proposals. I hope I can do that this week.
simon
On Tue, Feb 22, 2011 at 3:16 PM, David Nemeskey
nemeskey.da...@sztaki.hu wrote:
Hey,
I have written the proposal. Please let me know if you want more / less of
nemeskey.da...@sztaki.hu
Sent: Tuesday, 22 February 2011 11:22:57
Subject: Re: GSoC
I think that is good for now. I should get started on codeawards and
wrap up our proposals. I hope I can do that this week.
simon
On Tue, Feb 22, 2011 at 3:16 PM, David Nemeskey
nemeskey.da...@sztaki.hu wrote:
Hey
Hi guys,
Mark, Robert, Simon: thanks for the support! I really hope we can work
together this summer (and before that, obviously).
According to http://www.google-melange.com/document/show/gsoc_program/google/gsoc2011/timeline , there's
still some time until the application period. So let me
Hey David,
I saw that you added a tiny line to the GSoC Lucene wiki - thanks for that.
On Wed, Feb 2, 2011 at 10:10 AM, David Nemeskey
nemeskey.da...@sztaki.hu wrote:
Hi guys,
Mark, Robert, Simon: thanks for the support! I really hope we can work
together this summer (and before that,
On Feb 2, 2011, at 4:10 AM, David Nemeskey wrote:
Hi guys,
Mark, Robert, Simon: thanks for the support! I really hope we can work
together this summer (and before that, obviously).
Sounds like a great idea. Looking forward to the proposal.
According to http://www.google-
+1 the proposal. We already have a committer digging into this area - he would
make a perfect GSoC mentor! And would likely love the help.
His response likely to follow...
- Mark
On Jan 28, 2011, at 11:32 AM, David Nemeskey wrote:
Hi all,
I have already sent this mail to Simon Willnauer,
On Fri, Jan 28, 2011 at 5:42 PM, Mark Miller markrmil...@gmail.com wrote:
+1 the proposal. We already have a committer digging into this area - he
would make a perfect GSoC mentor! And would likely love the help.
same here +1 - if there is mentoring needed I will be there too.
Robert I
On Fri, Jan 28, 2011 at 11:32 AM, David Nemeskey
nemeskey.da...@sztaki.hu wrote:
Hi all,
I have already sent this mail to Simon Willnauer, and he suggested that I post
it here for discussion.
I am David Nemeskey, a PhD student at the Eotvos Lorand University, Budapest,
Hungary. I am doing an
Thanks guys! So happy to get it, and really excited that Mahout got 5 slots.
@Robin: I'm totally up for a shared blog, was planning on blogging about
it anyway.
Robin Anil wrote:
Congrats everyone. And a special thanks to Benson for helping us get the
slots to 5 this year :)
For students
Thanks everyone! I am so excited to be accepted and I will do my best to
finish my proposal in time.
A shared blog sounds great to me. GSoC looks like training; we are supposed
to share the experience with all who are interested in the Mahout project.
Cheers,
Zhendong
On Tue, Apr 27, 2010 at 3:22 PM,
+1 for shared blog!
Thanks. It's great to finally have the chance to be a part of Apache
Mahout. Congratulations to everyone who got selected!
+1 for the shared blog idea!
On Tue, Apr 27, 2010 at 12:52 PM, Robin Anil robin.a...@gmail.com wrote:
Congrats everyone. And a special thanks to Benson for helping us
Thanks everyone!
This is a fantastic opportunity, and I'll try to make the best of this for
myself, as well as Mahout. Hopefully, we'll have a great compilation of deep
learning networks within the next few releases.
BTW, congrats to everyone on Mahout becoming a TLP!
On Tue, Apr 27, 2010 at
Timeline including Apache internal deadlines:
http://cwiki.apache.org/confluence/display/COMDEVxSITE/GSoC
Mentors, please also click on the ranking link to the ranking explanation [1]
for more information on how to rank student proposals.
Isabel
[1]
Hi Grant,
Could you please give us the link of this page?
Cheers,
Zhendong
On Wed, Mar 31, 2010 at 8:53 PM, Grant Ingersoll gsing...@apache.org wrote:
I created a Wiki page on GSOC. I hope everyone considering GSOC reads it.
Mentors, please add as you see fit. Would be good to get a Mahout
D'oh! My bad: http://cwiki.apache.org/MAHOUT/gsoc.html. It's linked from the
front wiki page under community.
-Grant
On Mar 31, 2010, at 9:11 AM, zhao zhendong wrote:
Hi Grant,
Could you please give us the link of this page?
Cheers,
Zhendong
On Wed, Mar 31, 2010 at 8:53 PM, Grant
Ha, thanks.
On Wed, Mar 31, 2010 at 9:29 PM, Grant Ingersoll gsing...@apache.org wrote:
D'oh! My bad: http://cwiki.apache.org/MAHOUT/gsoc.html. It's linked from
the front wiki page under community.
-Grant
On Mar 31, 2010, at 9:11 AM, zhao zhendong wrote:
Hi Grant,
Could you please
Hi Tanya,
MAHOUT-328 is just a general stub. There is no detailed project
description other than what is given there. The idea is we let you propose
to implement a clustering algorithm in Mahout. Start here
http://cwiki.apache.org/MAHOUT/gsoc.html. Browse through the Wiki. Look at
On Mon Robin Anil robin.a...@gmail.com wrote:
2. UIMA Integration with Mahout? (Maybe a good project if UIMA folks
are taking in GSOC students)
I guess one could easily split this one in two:
a) Using UIMA (whole pipeline or just the analysers if that is possible)
for data pre-processing
On Wed Robin Anil robin.a...@gmail.com wrote:
Greetings! Fellow GSOC alums, administrators and dear mentors, the
next edition is right here. Details are given in the link below.
https://groups.google.com/group/google-summer-of-code-discuss/browse_thread/thread/d839c0b02ac15b3f
Some
Some more Wild and Wacky Ideas. Might be out of scope for GSOC, but are
nice-to-have features for Mahout. I would like to encourage all of you to put
down your ideas here.
1. Data Visualization tool backed with HDFS/Hbase for inspecting clusters,
Topic model etc etc
- It could have many
done.
--- On Tue 8.9.09, Grant Ingersoll gsing...@apache.org wrote:
From: Grant Ingersoll gsing...@apache.org
Subject: [GSOC] Code Submissions
To: Mahout Dev List mahout-dev@lucene.apache.org
Date: Tuesday 8 September 2009, 13:09
Hi Robin, David and Deneche,
You will need to submit
I filled out one for Deneche.
On Tue, Jul 7, 2009 at 9:32 AM, deneche abdelhakim a_dene...@yahoo.fr wrote:
The students mid-term survey is available online. I'm posting this because
I almost forgot it =P
--- On Wed 17.6.09, Grant Ingersoll gsing...@apache.org wrote:
From:
On Tuesday 07 July 2009 20:34:09 Ted Dunning wrote:
I filled out one for Deneche.
I submitted the one for Robin yesterday evening.
Isabel
--
QOTD: Products developed for every kind of idiot * Printed on the bottom
of a Tesco tiramisu dessert: ``Do not turn upside down.''
Very similar, but I was talking about building trees on each split of the
data (a la map reduce split).
That would give many small splits and would thus give very different results
from bagging because the splits would be small and contiguous rather than
large and random.
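To make the contrast concrete, here is a minimal sketch in plain Python (not Mahout code; the function names `contiguous_splits` and `bootstrap_samples` are made up for illustration). It assumes rows are addressed by index and shows the two ways of choosing training rows per tree: small contiguous blocks, roughly what a map-reduce input split would give, versus large random samples drawn with replacement as in bagging.

```python
import random

def contiguous_splits(n_rows, n_splits):
    """Partition row indices 0..n_rows-1 into contiguous blocks,
    roughly what a map-reduce input split would hand to each mapper."""
    base, extra = divmod(n_rows, n_splits)
    splits, start = [], 0
    for i in range(n_splits):
        # The first `extra` splits get one additional row.
        size = base + (1 if i < extra else 0)
        splits.append(list(range(start, start + size)))
        start += size
    return splits

def bootstrap_samples(n_rows, n_trees, seed=0):
    """Draw one bootstrap sample per tree: n_rows indices chosen
    uniformly with replacement, as in classic bagging."""
    rng = random.Random(seed)
    return [[rng.randrange(n_rows) for _ in range(n_rows)]
            for _ in range(n_trees)]

if __name__ == "__main__":
    print("per-split rows:", contiguous_splits(10, 3))
    print("bagged rows:   ", [sorted(b) for b in bootstrap_samples(10, 3)])
```

Each tree built on a contiguous split sees only a small, disjoint slice of the data, while each bagged tree sees a sample the size of the whole dataset, which is why the two approaches can behave so differently.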
On Thu, Jun 18, 2009 at
On Tuesday 12 May 2009 19:50:21 Grant Ingersoll wrote:
http://socghop.appspot.com/document/show/program/google/gsoc2009/timeline
May 23. Hope all of our students and mentors are ready to go.
I certainly am*.
Isabel
* Might be a bit distracted on that exact day though: It's my birthday ;)
It's also helpful to get yourself a Wiki account and a JIRA account if
you don't already have them. Small patches to the existing docs/code
can also help you figure out the process
On Apr 21, 2009, at 1:19 PM, Isabel Drost wrote:
On Tuesday 21 April 2009 08:30:34 David Hall wrote:
As
Thanks everyone!
-- David
On Thu, Apr 23, 2009 at 12:53 PM, Grant Ingersoll gsing...@apache.org wrote:
It's also helpful to get yourself a Wiki account and a JIRA account if you
don't already have them. Small patches to the existing docs/code can also
help you figure out the process
On
...@cs.stanford.edu wrote:
From: David Hall d...@cs.stanford.edu
Subject: Re: [GSOC] Accepted Students
To: mahout-dev@lucene.apache.org
Date: Tuesday 21 April 2009, 8:30
On Mon, Apr 20, 2009 at 11:18 PM,
deneche abdelhakim a_dene...@yahoo.fr
wrote:
Hi,
=D
I've been accepted. And I'll
.
* know how to run an example in Hadoop, at least in pseudo-distributed:
http://hadoop.apache.org/core/docs/current/quickstart.html
--- On Tue 21.4.09, David Hall d...@cs.stanford.edu wrote:
From: David Hall d...@cs.stanford.edu
Subject: Re: [GSOC] Accepted Students
To
:
From: David Hall d...@cs.stanford.edu
Subject: Re: [GSOC] Accepted Students
To: mahout-dev@lucene.apache.org
Date: Tuesday 21 April 2009, 8:30
On Mon, Apr 20, 2009 at 11:18 PM,
deneche abdelhakim a_dene...@yahoo.fr
wrote:
Hi,
=D
I've been accepted. And I'll be working
On Tuesday 21 April 2009 08:30:34 David Hall wrote:
As for questions, what am I supposed to be reading during this
community building period? I see:
* http://cwiki.apache.org/MAHOUT/howtocontribute.html
* http://www.apache.org/foundation/how-it-works.html
plus skimming javadocs.
These are
Hi
I decided to go with the mixture model for EM.
I have modified my proposal and submitted it on both the GSoC website and the Apache wiki.
Best Regards
Yifan
2009/4/1 Yifan Wang heavens...@gmail.com:
I will choose Mixture Model for the EM implementation.
Yifan
2009/4/1 Ted Dunning
I would hope that your SVD implementation would not be limited to NetFlix
like problems, but would be applicable to any reasonably sparse matrix-like
data.
Likewise, I would expect a good SVD implementation to be useful for nearest
neighbor methods or direct prediction by smoothing the history
On Wed, Apr 1, 2009 at 1:30 AM, Ted Dunning ted.dunn...@gmail.com wrote:
I would hope that your SVD implementation would not be limited to NetFlix
like problems, but would be applicable to any reasonably sparse matrix-like
data.
Yes, of course. It would apply to any large sparse matrix
Thanks David, that helped.
On Wed, Apr 1, 2009 at 1:47 AM, David Hall d...@cs.stanford.edu wrote:
On Tue, Mar 31, 2009 at 11:43 PM, Atul Kulkarni atulskulka...@gmail.com
wrote:
questions in line.
On Wed, Apr 1, 2009 at 1:27 AM, Ted Dunning ted.dunn...@gmail.com
wrote:
Nobody is
I'm preparing an application, but haven't submitted yet as I was
waiting on confirmation of my student status... as I now know that I'm
going to be eligible I'll get my application in soon :)
2009/4/1 Ted Dunning ted.dunn...@gmail.com:
I only see two applications for Mahout, one reasonably
Hmm, I see several in there, but they aren't all labeled w/ Mahout, so
that may be why. I also expanded to see 100 at a time.
-Grant
On Mar 31, 2009, at 8:43 PM, Ted Dunning wrote:
I only see two applications for Mahout, one reasonably strong, one
much less
so.
Are there students out
The other thing to note, here, is that people should be aware that the
ASF is only going to get a certain number of slots from Google (last
year, it was somewhere in the 30-40 range, I think), which are
distributed across all projects that have expressed an interest in
mentoring. While
The machinery of SVD is almost always described in terms of least squares
matrix approximation without mentioning the probabilistic underpinnings of
why least-squares is a good idea. The connection, however, goes all the way
back to Gauss' reduction of planetary position observations (this is
Let me second that. When I am hiring a student without professional
experience, it is almost a perfect predictor that if they have done
significant work on a significant outside project they will get an interview
with me and if not, they won't.
Moreover, if I have a candidate at any level who
Hi Yifan,
I think both are good candidates, although AIUI, SVM is a bit harder
to parallelize, so maybe it would make sense to focus on EM. Of
course, we don't have to be distributed, so you could propose a non-
distributed SVM implementation as a first cut and then work on the
Yifan,
EM is a highly non-specific term and covers a huge range of very different
algorithms. For example, pLSI, HMM's, and mixture models can all be
estimated using EM.
What exactly did you mean to address with an EM implementation?
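As one concrete instance of the point above, here is a minimal sketch of EM for the mixture-model case: a two-component, one-dimensional Gaussian mixture in plain Python. The function name `em_gmm_1d`, the toy data, and the starting values are all assumptions made for illustration; this is not Mahout code. pLSI or HMM training would use the same E-step/M-step pattern with very different update formulas.

```python
import math

def em_gmm_1d(xs, mu, sigma, pi, iters=50):
    """EM for a two-component 1-D Gaussian mixture.
    mu, sigma, pi are length-2 lists holding the initial guesses."""
    mu, sigma, pi = list(mu), list(sigma), list(pi)
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            w = [pi[k]
                 * math.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                 / (sigma[k] * math.sqrt(2.0 * math.pi))
                 for k in range(2)]
            total = sum(w)
            resp.append([wk / total for wk in w])
        # M-step: re-estimate weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)  # floor avoids collapse to zero
            pi[k] = nk / len(xs)
    return mu, sigma, pi

if __name__ == "__main__":
    # Toy data: one cluster around 0, one around 5.
    data = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
    print(em_gmm_1d(data, mu=[1.0, 4.0], sigma=[1.0, 1.0], pi=[0.5, 0.5]))
```

The sketch skips convergence checks and log-space arithmetic, both of which a real implementation would want for larger or noisier data.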
On Wed, Apr 1, 2009 at 1:05 PM, Grant Ingersoll
I will choose Mixture Model for the EM implementation.
Yifan
2009/4/1 Ted Dunning ted.dunn...@gmail.com:
Yifan,
EM is a highly non-specific term and covers a huge range of very different
algorithms. For example, pLSI, HMM's, and mixture models can all be
estimated using EM.
What exactly
On Wed, Apr 1, 2009 at 7:12 PM, Xuan Yang sailingw...@gmail.com wrote:
Hello everyone,
This is my proposal draft.
BTW remember http://markmail.org/message/rbwp2hf6iipc2ut3
- robert
Thanks, I have submitted it there. :)
2009/4/2 Robert Burrell Donkin robertburrelldon...@gmail.com:
On Wed, Apr 1, 2009 at 7:12 PM, Xuan Yang sailingw...@gmail.com wrote:
Hello everyone,
This is my proposal draft.
BTW remember http://markmail.org/message/rbwp2hf6iipc2ut3
- robert
--
Here is a draft of my proposal
**
Title/Summary: [Apache Mahout] Implement parallel Random/Regression Forests
Student: AbdelHakim Deneche
Student e-mail: ...
Student Major: PhD in Computer Science
Student Degree: Master in Computer Science
I only see two applications for Mahout, one reasonably strong, one much less
so.
Are there students out there who still need to prepare an application?
The deadline is coming up fast.
2009/3/31 Grant Ingersoll gsing...@apache.org
FYI: http://wiki.apache.org/general/RankingProcess
-Grant
Deneche,
I don't see your application on the GSOC web site. Nor on the apache wiki.
Time is running out and I would hate to not see you in the program. Is it
just that I can't see the application yet?
On Tue, Mar 31, 2009 at 1:05 PM, deneche abdelhakim a_dene...@yahoo.fr wrote:
Here is a
in the node hard-drive, and thus must be
distributed across the cluster.
abdelHakim
--- On Mon 30.3.09, Ted Dunning ted.dunn...@gmail.com wrote:
From: Ted Dunning ted.dunn...@gmail.com
Subject: Re: [gsoc] random forests
To: mahout-dev@lucene.apache.org
Date: Monday 30 March 2009, 0:59
I
Indeed. And those datasets exist.
It is also plausible that this full data scan approach will fail when you
want the forest building to take less time.
It is also plausible that a full data scan approach fails to improve enough
on a non-parallel implementation. This would happen if a
I suggest that we all learn from the experience you are about to have on the
reference implementation.
And, yes, I did mean the reference implementation when I said
non-parallel. Thanks for clarifying.
On Mon, Mar 30, 2009 at 10:45 AM, deneche abdelhakim a_dene...@yahoo.fr wrote:
What do you
I have two answers for you.
The first is that for any given application, the odds that the data will not
fit in a single machine are small, especially if you have an out-of-core
tree builder. Really, really big datasets are increasingly common, but are
still a small minority of all datasets.
you should read in . 2a
. This implementation is, relatively, easy given...
--- On Sat 28.3.09, deneche abdelhakim a_dene...@yahoo.fr wrote:
From: deneche abdelhakim a_dene...@yahoo.fr
Subject: Re: [gsoc] random forests
To: mahout-dev@lucene.apache.org
Date: Saturday 28 March 2009
Graph ranking strategies are something I am very much interested in
and would love to see in Mahout. Please do propose.
-Grant
On Mar 24, 2009, at 6:00 AM, Xuan Yang wrote:
Hello everyone,
I am a student from Fudan University, Shanghai, China.
These days I am doing some research work on
ok~ I will do it asap~
btw, is there any advice?
thanks a lot~ :)
2009/3/24 Grant Ingersoll gsing...@apache.org
Graph ranking strategies are something I am very much interested in and
would love to see in Mahout. Please do propose.
-Grant
On Mar 24, 2009, at 6:00 AM, Xuan Yang wrote:
would be
interested in the first... but of course, if the community actually needs them
both :)
--- On Tue 24.3.09, Ted Dunning ted.dunn...@gmail.com wrote:
From: Ted Dunning ted.dunn...@gmail.com
Subject: Re: GSoC 2009-Discussion
To: mahout-dev@lucene.apache.org
Date: Tuesday 24 March 2009
Answering some of your email out of order,
On Mon, Mar 23, 2009 at 10:00 PM, Xuan Yang sailingw...@gmail.com wrote:
These days I am doing some research work on SimRank, which is a model
measuring similarity of objects.
Great.
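For readers unfamiliar with SimRank, the idea is that two objects are similar if they are referenced by similar objects. A minimal sketch of the naive fixed-point iteration follows, in plain Python. The function name `simrank`, the toy graph, and the decay factor 0.8 are illustrative assumptions, not code from this thread, and the all-pairs loop is far too slow for graphs of realistic size.

```python
def simrank(in_neighbors, c=0.8, iters=10):
    """Naive SimRank iteration.
    in_neighbors maps each node to the list of nodes that point to it.
    c is the decay factor (0.8 is only an illustrative choice)."""
    nodes = list(in_neighbors)
    # Start from the identity: every node is fully similar to itself only.
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iters):
        new = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    # Nodes without in-links are similar to nothing else.
                    new[a][b] = 0.0
                    continue
                # Average similarity over all pairs of in-neighbors, damped by c.
                total = sum(sim[i][j] for i in ia for j in ib)
                new[a][b] = c * total / (len(ia) * len(ib))
        sim = new
    return sim

if __name__ == "__main__":
    # Toy graph: univ -> profA, univ -> profB, profA -> studentA, profB -> studentB
    graph = {
        "univ": [],
        "profA": ["univ"],
        "profB": ["univ"],
        "studentA": ["profA"],
        "studentB": ["profB"],
    }
    scores = simrank(graph)
    print(scores["profA"]["profB"], scores["studentA"]["studentB"])
```

Every iteration touches all pairs of nodes and all pairs of their in-neighbors, so the computational cost grows very quickly with graph size.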
I think it would be great to solve these problems and
[snip]
a web crawler. By doing this, a crawler, for instance, can use the
output of the classification to only follow certain links that lie on
informative content parts.
Does this sound interesting and make sense to you guys?
Hi Samuel. This would be of great interest for the Nutch folks, I
Mmmm :) This would definitely be very useful to anyone dealing with web
page parsing and indexing.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Samuel Louvan samuel.lou...@gmail.com
To: mahout-dev@lucene.apache.org
Sent: Sunday,
Hi guys,
I'm actually interested in your project. I haven't started my proposal
yet, because I'm still working on my finals now. I'll be writing it soon and
will let you guys know of any updates. But I'm generally interested in this idea:
http://wiki.apache.org/general/SummerOfCode2008#lucene
I had
Hi Z.S.,
I'll update LUCENE-1313 after LUCENE-1516 is committed. I can post the
basic new patch I have for LUCENE-1313 (heavily simplified compared to the
previous patches), however it will assume LUCENE-1516. The other area that
will need to be addressed is standard benchmarking for different
I think creating a better Highlighter for Lucene, which is actively
being discussed:
https://issues.apache.org/jira/browse/LUCENE-1522
would make a good GSoC project, but I don't think I have time to mentor.
Realtime search is currently in progress already, being tracked/iterated
here: