> On Nov. 14, 2014, 2:40 a.m., Matthew Hayes wrote:
> > datafu-pig/src/main/macros/nlp/tf_idf.pig
> > Lines 72 (patched)
> > <https://reviews.apache.org/r/27820/diff/1/?file=756916#file756916line72>
> >
> >     Shouldn't this be SUM?

As far as I can tell, it's OK that this is COUNT if we're counting documents. As I understand TF-IDF, the IDF part divides by the number of documents containing the term, not by the number of actual occurrences.


- Eyal


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27820/#review61348
-----------------------------------------------------------


On Nov. 10, 2014, 8:33 p.m., Russell Jurney wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27820/
> -----------------------------------------------------------
>
> (Updated Nov. 10, 2014, 8:33 p.m.)
>
>
> Review request for DataFu, pig, Joseph Adler, Jakob Homan, Matthew Hayes, and Sam Shah.
>
>
> Repository: datafu
>
>
> Description
> -------
>
> DATAFU-61 - Add TF-IDF Macro to DataFu
>
>
> Diffs
> -----
>
>   datafu-pig/src/main/macros/nlp/tf_idf.pig PRE-CREATION
>   datafu-pig/src/test/macros/nlp/test_tf_idf.pig PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/27820/diff/1/
>
>
> Testing
> -------
>
> Works for me, but testing not automated. See https://issues.apache.org/jira/browse/DATAFU-61
>
>
> Thanks,
>
> Russell Jurney
>
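
For anyone following the COUNT versus SUM question in the reply above, here is a minimal Python sketch of the distinction. It is not the macro's code: the function names and the toy documents are hypothetical, and it assumes the common IDF definition log(N / df), where df is the number of documents that contain a term. Counting documents corresponds to COUNT over one row per (term, document) pair, while summing per-document term counts would give total occurrences instead.

```python
import math
from collections import Counter


def document_frequency(docs):
    """For each term, count how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        # set() collapses repeats, so a document contributes at most 1 per term
        df.update(set(tokens))
    return df


def total_occurrences(docs):
    """For each term, sum its occurrences across all documents."""
    cf = Counter()
    for tokens in docs:
        cf.update(tokens)
    return cf


def idf(docs):
    """Inverse document frequency, assuming idf(t) = log(N / df(t))."""
    n = len(docs)
    return {term: math.log(n / count)
            for term, count in document_frequency(docs).items()}


# Hypothetical toy corpus: "pig" appears 4 times in total, but in only 2 documents.
docs = [
    ["pig", "pig", "pig", "macro"],
    ["pig", "datafu"],
    ["udf"],
]

df = document_frequency(docs)   # document counts (the COUNT-style quantity)
cf = total_occurrences(docs)    # occurrence totals (the SUM-style quantity)
idf_scores = idf(docs)
```

In this toy corpus the two quantities differ only for "pig" (2 documents versus 4 occurrences), and under the assumed definition it is the document count that feeds the IDF.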