> It means that BigTable is used for analytical processing over an
> arbitrary set of elements selected by query, not for relational
> data processing.

Sorry, I think that could easily be misunderstood.
It was only a rough expression of my thoughts.

Let me see about it. :)
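
To make the Map<row, Map<column, cell>> idea from my earlier mail (quoted below) a bit more concrete, here is a minimal sketch in plain Java. It assumes integer row/column keys and double cells, and it keeps everything in memory. The SparseMatrix class and its methods are hypothetical names for illustration only; they are not the HADOOP-2515 code and not an HBase API.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: a sparse matrix stored as Map<row, Map<column, cell>>.
// Names are illustrative only; nothing here comes from HADOOP-2515 or HBase.
class SparseMatrix {
    // Outer key = row index, inner key = column index; a missing entry means 0.0.
    private final Map<Integer, Map<Integer, Double>> rows = new TreeMap<>();

    // Store a cell; writing 0.0 removes the entry so the map stays sparse.
    void set(int row, int col, double value) {
        if (value == 0.0) {
            Map<Integer, Double> r = rows.get(row);
            if (r != null) {
                r.remove(col);
                if (r.isEmpty()) {
                    rows.remove(row);
                }
            }
            return;
        }
        rows.computeIfAbsent(row, k -> new TreeMap<>()).put(col, value);
    }

    // Read a cell, treating absent entries as 0.0.
    double get(int row, int col) {
        Map<Integer, Double> r = rows.get(row);
        return (r == null) ? 0.0 : r.getOrDefault(col, 0.0);
    }

    // Element-wise sum of this matrix and another.
    SparseMatrix add(SparseMatrix other) {
        SparseMatrix result = new SparseMatrix();
        for (Map.Entry<Integer, Map<Integer, Double>> r : rows.entrySet()) {
            for (Map.Entry<Integer, Double> c : r.getValue().entrySet()) {
                result.set(r.getKey(), c.getKey(), c.getValue());
            }
        }
        for (Map.Entry<Integer, Map<Integer, Double>> r : other.rows.entrySet()) {
            for (Map.Entry<Integer, Double> c : r.getValue().entrySet()) {
                int i = r.getKey();
                int j = c.getKey();
                result.set(i, j, result.get(i, j) + c.getValue());
            }
        }
        return result;
    }

    // Matrix product: only stored (non-zero) cells are visited.
    SparseMatrix multiply(SparseMatrix other) {
        SparseMatrix result = new SparseMatrix();
        for (Map.Entry<Integer, Map<Integer, Double>> r : rows.entrySet()) {
            int i = r.getKey();
            for (Map.Entry<Integer, Double> c : r.getValue().entrySet()) {
                int k = c.getKey();
                Map<Integer, Double> otherRow = other.rows.get(k);
                if (otherRow == null) {
                    continue; // row k of the other matrix is all zeros
                }
                for (Map.Entry<Integer, Double> e : otherRow.entrySet()) {
                    int j = e.getKey();
                    result.set(i, j, result.get(i, j) + c.getValue() * e.getValue());
                }
            }
        }
        return result;
    }
}
```

The nested map mirrors a row -> column -> cell layout, which is why I think it would map naturally onto a table store; a version/timestamp dimension is left out of this sketch.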

On 2/7/08, edward yoon <[EMAIL PROTECTED]> wrote:
> Actually, most of my Hadoop applications are for numerical analysis.
> Therefore, I tried to make a generalized matrix input/output format,
> https://issues.apache.org/jira/browse/HADOOP-2515
> as a Map<row, Map<column, cell>> structure, after reviewing the code
> and discussing it with Gary Bradski.
>
> But if I make a new matrix file structure on Hadoop HDFS, I think it
> would end up resembling HBase. So I think Hadoop + HBase is a good
> fit for matrix management and operations.
>
> "It (BigTable) presents the abstraction of a 2-dimensional
> table of data cells, with different versions over time making
> up a third dimension." -- Failure Trends in a Large Disk Drive
> Population, 2007
>
> It means that BigTable is used for analytical processing over an
> arbitrary set of elements selected by query, not for relational
> data processing.
>
> >  I see http://wiki.apache.org/hadoop/Matrix
>
> Thanks for your review.
> I hope we can talk soon.
>
> On 2/7/08, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> > How do you think these various libraries fit into Hadoop?  Does it
> > make sense to just build what we need using HBase?  I see
> > http://wiki.apache.org/hadoop/Matrix does some matrix things, but
> > then it has a Groovy overlay, so it isn't quite what we want, I
> > don't think.
> >
> > Perhaps, we should just think about, and push up to Hadoop if we can,
> > our own set of Hadoop based matrix libraries.  Starting off, we need a
> > decent way to create a matrix and populate it, then also basic matrix
> > things like addition, multiplication, etc.  Then we can add other
> > things as we need them?  For instance, I am interested in TextRank
> > (search for Mihalcea and TextRank) and it essentially comes down to
> > doing an iterative algorithm over a matrix.  I was thinking I might,
> > as a way to get deeper into the latest Hadoop, use it as a sample,
> > useful algorithm.  It's not specifically ML, but it does have
> > interesting results and it is fairly easy to implement.
> >
> > Should we just lay out a page on the Wiki where we can start thinking
> > about matrix needs?  Using other libraries is definitely an option,
> > but I am not sure if they will be optimal in the Hadoop environment.
> >
> > -Grant
> >
> > On Feb 6, 2008, at 12:18 PM, Ted Dunning wrote:
> >
> > >
> > > There are unfortunately many choices for linear algebra on the
> > > JVM, none particularly satisfactory.
> > >
> > > Colt is the one I use.  It has a very odd syntax, but gives good
> > > performance.  The structure is such that it is very hard to extend
> > > to, say, sparse matrices.  The licensing on Colt isn't particularly
> > > easy, either, and I have been unable to contact the author to see
> > > about liberalizing it.
> > >
> > > Jama is now essentially defunct, but it had a very simple API and
> > > not very high performance.  Extending to additional matrix types is
> > > also not feasible due to the design exposing matrix internal
> > > structure as a double indexed matrix.  The licensing on Jama is
> > > very open.
> > >
> > > MTJ is high performance and has a less strange API than Colt, but I
> > > haven't used it so I can't say much about performance.  I get the
> > > impression it would be difficult to extend, but I could well be
> > > wrong about that.
> > >
> > > Commons math uses an extension of Jama, I think.  I haven't used
> > > it.  The last time I looked seriously at commons math, the
> > > committers had some very odd agendas going on so I dropped it from
> > > consideration.  It looks like it has come quite a ways since then,
> > > but I haven't dug into it deeply since my first evaluation.
> > >
> > >
> > > On 2/6/08 12:45 AM, "Paul Elschot" <[EMAIL PROTECTED]> wrote:
> > >
> > >> On Wednesday 06 February 2008 05:23:31, Markus Weimer wrote:
> > >>> Hi,
> > >>> One of my contributions to Elefant is an adapter to the Java
> > >>> version of UIMA which allows you to pipe Python strings through a
> > >>> UIMA annotation engine and get feature vectors to work with back.
> > >>> This was done using JPype: <http://jpype.sourceforge.net/>, a
> > >>> tool which links the JVM to the CPython VM.
> > >>>
> > >>> I chose this non-obvious approach because we use native code
> > >>> Python extensions for the matrix operations, an area where Java
> > >>> regrettably lags behind big time compared to native code. So,
> > >>> Jython was out of the question as I don't know any way to access
> > >>> a CPython extension from Jython. I found JPype to do the job and
> > >>> to do it well (the overhead per cross-VM call was around 1ms on
> > >>> my laptop). So for those craving a state-of-the-art Python with
> > >>> decent extensions and access to Java code, this might be an
> > >>> option.
> > >>
> > >> Well, one of my favourite Java libraries made it into the email
> > >> address of this list, and I must say, I was hoping to get some
> > >> good solutions to the problem of linear algebra in a JVM here.
> > >> Has this problem been discussed beforehand?
> > >>
> > >> I have only used linear algebra packages well before there was Java,
> > >> so I wonder how to go about it now.
> > >>
> > >> Regards,
> > >> Paul Elschot
> > >>
> > >
> >
> > --------------------------
> > Grant Ingersoll
> > http://lucene.grantingersoll.com
> > http://www.lucenebootcamp.com
> >
> > Lucene Helpful Hints:
> > http://wiki.apache.org/lucene-java/BasicsOfPerformance
> > http://wiki.apache.org/lucene-java/LuceneFAQ
> >
> >
> >
> >
> >
>
>
> --
> B. Regards,
> Edward yoon @ NHN, corp.
>


-- 
B. Regards,
Edward yoon @ NHN, corp.
