Re: [Lucy] Roadmap for first release

2010-07-29 Thread Peter Karman
think we should still do just that: shunt all attention and communication to Lucy, by listing the Lucy website and mailing lists in the KinoSearch documentation under SUPPORT. +1 -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Roadmap for first release

2010-07-29 Thread Peter Karman
-- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Re: Set up for new mailing lists

2010-07-25 Thread Peter Karman
Marvin Humphrey wrote on 7/25/10 3:07 PM: peter AT peknet DOT com yes, please. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Re: [Lucy Wiki] Update of LucyIncubatorProposal by PeterKarman

2010-06-29 Thread Peter Karman
which increases quality of contribution. I'm not sure where to put those items; community is one possibility. Looks like you have this under Alignment now? -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Divide user mailing lists by host language

2010-06-28 Thread Peter Karman
end up being illuminating to non-users of the language. Anyway, is it a big deal to add more lists later, based on demand, rather than splitting them up in the initial proposal? -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Sponsoring Entity

2010-06-28 Thread Peter Karman
#Sponsor -- Peter Karman . http://peknet.com/ . pe...@peknet.com

[jira] Updated: (LUCY-114) compile failure on OS X 10.6

2010-06-19 Thread Peter Karman (JIRA)
[ https://issues.apache.org/jira/browse/LUCY-114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Karman updated LUCY-114: -- Attachment: align_signature.patch The inline patch came out garbled. Same patch attached. compile

[jira] Created: (LUCY-115) nullable attribute not propagated

2010-06-19 Thread Peter Karman (JIRA)
Reporter: Peter Karman In Clownfish files (.bp) such as KinoSearch/Search/Compiler.bp, certain methods are defined as nullable but that nullable attribute is not being propagated to the _OVERRIDE generated code. -- This message is automatically generated by JIRA. - You can reply to this email

[jira] Created: (LUCY-116) Build.PL opts not supported

2010-06-19 Thread Peter Karman (JIRA)
: Peter Karman The Build.PL docs claim that --config cc= should work but it does not. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.

[jira] Updated: (LUCY-116) Build.PL opts not supported

2010-06-19 Thread Peter Karman (JIRA)
[ https://issues.apache.org/jira/browse/LUCY-116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Karman updated LUCY-116: -- Attachment: get_cc.patch pass_cc.patch The attached patches implement the documented

ANNOUNCE: KinoSearch 0.30_101 released

2010-05-01 Thread Peter Karman
offset was ignored. * r6079-6081: Fix problem with corruption of the Perl stack by Host_callback(). -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: get_score == nan

2010-04-06 Thread Peter Karman
Marvin Humphrey wrote on 04/06/2010 01:20 PM: On Tue, Apr 06, 2010 at 11:56:56AM -0500, Peter Karman wrote: Is there any legitimate way that $hitdoc->get_score would return a value that Perl stringifies to 'nan'? When using a SortSpec that doesn't include scoring, yes. In that case

Re: [Lucy] FreeBSD builds

2010-04-04 Thread Peter Karman
-- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] FreeBSD builds

2010-04-02 Thread Peter Karman
Peter Karman wrote on 4/2/10 9:36 PM: Marvin Humphrey wrote on 3/30/10 2:58 PM: Wish I had access to a FreeBSD 8.x box so I could figure out what went wrong here: http://www.cpantesters.org/cpan/report/7013414 Still broken, same problem: http://www.cpantesters.org/cpan/report

Re: [Lucy] Portability of KS 0.30_09 to various Unixen

2010-03-28 Thread Peter Karman
this way. Sysadmin-y stuff is a PITA. I can help in that way though. I'll try and set up some VMs for FreeBSD 8.0 and OpenSolaris to make it easier to test things. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Snapshot: segments key

2010-03-27 Thread Peter Karman
-- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Snapshot: segments key

2010-03-26 Thread Peter Karman
Marvin Humphrey wrote on 3/26/10 4:04 PM: After the change, it would look like this: { entries : [ schema_3.json, ], segments : [ seg_3 ], format : 1 } sounds good. -- Peter Karman . http://peknet.com/ . pe

Re: [Lucy] Logged dev IRC channel

2010-03-26 Thread Peter Karman
that. Does the logging really help the community? Or is it simply handy for your own memory-jogging? I'm not faulting the latter, just wondering. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: ProximityQuery

2010-03-22 Thread Peter Karman
, but within is a little easier to spell and just has a slightly more natural language linguistic emphasis as opposed to more traditional noun = value naming style. I like 'within' -- easy (enough) to remember and type. changes pushed in r5942. thanks for the thorough review, Marvin. -- Peter

Re: ProximityQuery

2010-03-21 Thread Peter Karman
Peter Karman wrote on 3/19/10 10:08 PM: I'm going to dive into the Proximity classes now and see if I can break them. r5936 implements ProximityQuery for KS. Marvin, please have a look when you have a chance, and let me know what needs changing. In the end it was a one-line difference

Re: ProximityQuery

2010-03-21 Thread Peter Karman
Marvin Humphrey wrote on 3/21/10 3:07 PM: On Sun, Mar 21, 2010 at 02:01:41AM -0500, Peter Karman wrote: Marvin, please have a look when you have a chance, and let me know what needs changing. The current implementation has a limitation I think is probably pretty important: 'b NEAR

Re: ProximityQuery

2010-03-20 Thread Peter Karman
. Wish I had it months ago. :) Now I understand the function-vs-method capitalization. thanks, Marvin. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: ProximityQuery

2010-03-19 Thread Peter Karman
) and were what sparked my initial question to the list. I'm going to dive into the Proximity classes now and see if I can break them. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: ProximityQuery

2010-03-16 Thread Peter Karman
the Wildcard features. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] MoreLikeThisQuery

2010-03-16 Thread Peter Karman
look at all those terms in a vector space, most of them will be clustered together, but one will be way far away. wouldn't POS tagging achieve the same ends? Or even a look-up lexicon of nouns? -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Re: MoreLikeThisQuery

2010-03-16 Thread Peter Karman
reduction. I've been looking at the SenseClusters package. It's very unfriendly to use, but the ideas in it are worth some investigation. http://www.d.umn.edu/~tpederse/senseclusters.html It uses SVDPACKC to do the big matrix math: http://netlib.org/svdpack/ -- Peter Karman . http://peknet.com

ProximityQuery

2010-03-15 Thread Peter Karman
feature, specifically the PhraseScorer and the internal winnow_anchors() function. Am I on the right track here? [0] I believe Lucene syntax for that query is "foo bar"~10 -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: Stable releases under new namespaces

2010-03-15 Thread Peter Karman
manage) versioning. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: Stable releases under new namespaces

2010-03-14 Thread Peter Karman
Marvin Humphrey wrote on 3/14/10 7:21 PM: On Sun, Mar 14, 2010 at 02:30:48PM -0500, Peter Karman wrote: I like this approach. It's more work for KS developers, in terms of managing api versions, but that's good work to do, since it requires making the api changes very explicit. I understand

Re: [Lucy] Re: Clownfish nullable bug

2010-03-05 Thread Peter Karman
Marvin Humphrey wrote on 3/3/10 10:16 AM: On Tue, Mar 02, 2010 at 11:16:49PM -0600, Peter Karman wrote: fyi, I've just committed a failing test to KS trunk as r5886. The issue manifests when an abstract nullable method is overridden in a subclass. The 'nullable' flag is not being parsed

Clownfish nullable bug

2010-03-02 Thread Peter Karman
in the grammar passed to Parse::RecDescent in Clownfish::Parser but that RecDescent stuff is some serious voodoo and I don't even know where to start. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Index-time RAM consumption settings (was Invalid UTF-8)

2010-02-03 Thread Peter Karman
a flush threshold, that is often mentioned as a way to control indexing speed vs. memory use. From what I've seen of KS the indexing speed is much faster and mem use much lower anyway, so I'm not worrying about it. Thanks for the detailed reply re: the issues involved. -- Peter Karman . http

Re: [Lucy] Re: SortCache on a 32-bit OS

2010-02-01 Thread Peter Karman
not realized that. This softens my position considerably. I'm all for increasing legacy performance so long as it doesn't complicate the mainline architecture. ++ This is an interesting thread, though I have nothing interesting to add. :) -- Peter Karman . http://peknet.com

Re: Invalid UTF-8

2010-01-27 Thread Peter Karman
Peter Karman wrote on 1/27/10 10:43 PM: The OSX behaviour was weird. First time it segfaulted. Ran it again under gdb and it completed ok. Ran it again without gdb and I got this: ignore these complaints. seems my os and/or fs was/is seriously fscked. -- Peter Karman . http://peknet.com

Re: memory use

2010-01-24 Thread Peter Karman
comparing apples with apples. I ran 0.30072 with fewer sortable fields, since it doesn't support sortable FullTextType, and I ran svn trunk with all FullTextType fields sortable. So the memory use could be due to that. I will rerun more comparable config and report back. -- Peter Karman

Re: [kinosearch-commits] r5736 - trunk/perl/buildlib/KinoSearch

2010-01-23 Thread Peter Karman
!~ /-march=\w+/ ) { +$extra_ccflags .= -march=i486 ; +} that flag, while necessary on 32bit archs that I've tried, breaks compilation immediately under 64bit archs with the message: cc1: CPU you selected does not support x86-64 instruction set -- Peter Karman . http

memory use

2010-01-23 Thread Peter Karman
to be expected? -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: Release strategies

2010-01-22 Thread Peter Karman
Marvin Humphrey wrote on 01/21/2010 11:01 PM: So that's one item checked off the TODO list. \o/ -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

2010-01-20 Thread Peter Karman
Nathan Kurz wrote on 1/20/10 1:35 AM: Could you attach your failing standalone test case so I can take a quick look at it? I tried the inline one above, but saw nothing strange with GCC 4.2.4 on Slamd64. http://rectangular.com/pipermail/kinosearch/2010-January/007228.html -- Peter Karman

Re: 64-bit linux errors with t/core/032-string_helper.t

2010-01-20 Thread Peter Karman
://peknet.com/~karpet/char-with-O2.txt http://peknet.com/~karpet/char-without-O2.txt -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

2010-01-20 Thread Peter Karman
of. I'm currently downloading gcc 3.4.6 (the last of the 3.4.x releases fwiw, from 2006), and will try duplicating the problem on a different platform with the same compiler version. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: 64-bit linux errors with t/core/032-string_helper.t

2010-01-19 Thread Peter Karman
Marvin Humphrey wrote on 01/19/2010 01:51 PM: On Tue, Jan 19, 2010 at 01:37:12PM -0600, Peter Karman wrote: ok - 0 == 0 ok - 1 == 1 ok - 100 == 100 ok - 126 == 126 ok - 127 == 127 ok - 128 == 128 ok - 129 == 129 ok - 250 == 250 ok - 254 == 254 ok - 255 == 255 Whew. If that's

Re: 64-bit linux errors with t/core/032-string_helper.t

2010-01-19 Thread Peter Karman
for awhile and hope that the answer just comes to me in my sleep. Can we tag out? If you can get me an account on one of these boxen, or if we can duplicate this behavior on an Amazon EC2 instance, I'd like to take a crack at it. I'll work on that and mail you offlist. -- Peter Karman

64-bit linux errors with t/core/032-string_helper.t

2010-01-18 Thread Peter Karman
but no memory problems %d, i); +ASSERT_INT_EQ(batch, StrHelp_UTF8_TRAILING[i], 7, +UTF8_TRAILING bogus but no memory problems %d, i); } } -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [KinoSearch] c99

2010-01-15 Thread Peter Karman
KinoSearch/Util/ToolSet.h #include <ctype.h> -- Peter Karman . http://peknet.com/ . pe...@peknet.com

C90 vs C99

2010-01-14 Thread Peter Karman
. Which makes me wonder: what standard are we aiming for, and do we need to be telling the compiler that? -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: C90 vs C99

2010-01-14 Thread Peter Karman
/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=x86_64-redhat-linux Thread model: posix gcc version 3.4.6 20060404 (Red Hat 3.4.6-3) -- Peter Karman

Re: C90 vs C99

2010-01-14 Thread Peter Karman
Marvin Humphrey wrote on 1/14/10 11:42 AM: On Thu, Jan 14, 2010 at 10:39:20AM -0600, Peter Karman wrote: A web search reveals that this is a GCC error, but it's unexpected because GCC should be tolerant by default. Do the flags that you're passing to GCC include -std=c90, -std=C89, or -ansi

c99

2010-01-14 Thread Peter Karman
/DirManip.c:101: error: ‘DT_DIR’ undeclared (first use in this function) I tried it with KINO_DEBUG=1 as well under CentOS5 (gcc 4.1.2) which includes -std=c89 and it failed with same error. So OS X is happy only with c99 and Linux is happy only without it. thoughts? -- Peter Karman . http
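
One possible way around the DT_DIR failure, offered as a sketch under assumptions rather than the fix that went into KinoSearch: as far as I know, glibc only exposes the DT_* constants from dirent.h when its BSD/default feature-test macros are in effect, and strict -std=c89 or -std=c99 switches those off. The helper below avoids d_type and DT_DIR entirely and asks stat() instead. The name is_directory is hypothetical and does not come from DirManip.c.

```c
/* Request POSIX declarations even under strict -std=c89 / -std=c99. */
#define _POSIX_C_SOURCE 200809L

#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper (not from DirManip.c): returns 1 if path names a
 * directory, 0 otherwise, including when stat() itself fails.
 * It does not touch dirent's d_type member or the DT_DIR constant. */
int
is_directory(const char *path) {
    struct stat sb;
    if (stat(path, &sb) != 0) {
        return 0;
    }
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```

The trade-off is one extra stat() call per directory entry, in exchange for not depending on which feature-test macros a given -std= flag leaves enabled.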

[jira] Created: (LUCY-88) change Charmonizer to use the standard stdint.h integer types

2009-12-11 Thread Peter Karman (JIRA)
Components: Charmonizer Reporter: Peter Karman Charmonizer currently uses non-standard type names for specific integer sizes. It would be more portable and easier for first-timers to grok if we used the standard stdint.h types instead, generating a usable stdint.h on systems where
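
To make the proposal concrete, here is a minimal sketch of what a stdint.h fallback could look like. It is not the attached patch. HAS_STDINT_H is a hypothetical macro standing in for whatever a configuration probe would define, pack_u32 is an invented example function, and the fallback branch assumes 8-bit chars.

```c
#include <limits.h>

/* HAS_STDINT_H is a hypothetical probe result; a real configure step
 * would define it only after a test compile of <stdint.h> succeeds. */
#if defined(HAS_STDINT_H) \
    || (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L)
  #include <stdint.h>
#else
  /* Fallback typedefs chosen from <limits.h>; assumes 8-bit chars. */
  typedef signed char    int8_t;
  typedef unsigned char  uint8_t;
  #if USHRT_MAX == 0xFFFF
    typedef short          int16_t;
    typedef unsigned short uint16_t;
  #else
    #error "no 16-bit integer type found"
  #endif
  #if UINT_MAX == 0xFFFFFFFF
    typedef int            int32_t;
    typedef unsigned int   uint32_t;
  #elif ULONG_MAX == 0xFFFFFFFF
    typedef long           int32_t;
    typedef unsigned long  uint32_t;
  #else
    #error "no 32-bit integer type found"
  #endif
#endif

/* Example use of the standard names: pack four bytes into one
 * 32-bit value, most significant byte first. */
uint32_t
pack_u32(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
    return ((uint32_t)a << 24) | ((uint32_t)b << 16)
         | ((uint32_t)c << 8)  |  (uint32_t)d;
}
```

Code written against uint32_t and friends then reads the same whether the names come from the system header or from the generated fallback.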

Re: [Lucy] Re: quiet gcc warnings

2009-12-11 Thread Peter Karman
Peter Karman wrote on 12/10/09 9:04 AM: Marvin Humphrey wrote on 12/9/09 6:15 PM: On Tue, Dec 08, 2009 at 09:24:44PM -0800, Marvin Humphrey wrote: So, instead of u32_t or chy_u32_t, we'd use uint32_t everywhere. We could actually generate a complete and usable stdint.h on systems where

Re: [Lucy] Re: quiet gcc warnings

2009-12-11 Thread Peter Karman
Marvin Humphrey wrote on 12/11/09 9:16 PM: On Fri, Dec 11, 2009 at 08:59:32PM -0600, Peter Karman wrote: well, I've opened the ticket and started poking around but I am not sure where to include stdint.h and/or test for its existence. Is Charmonizer/Probe/Integers.c where I want to make

[jira] Updated: (LUCY-88) change Charmonizer to use the standard stdint.h integer types

2009-12-11 Thread Peter Karman (JIRA)
[ https://issues.apache.org/jira/browse/LUCY-88?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Karman updated LUCY-88: - Attachment: stdint_h.patch this patch against r889879 adds stdint.h support change Charmonizer to use

Re: [Lucy] Re: quiet gcc warnings

2009-12-11 Thread Peter Karman
define a 'boolean' (spelled out long to avoid conflict) as a char. I expect others out there have done similar. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Target platforms

2009-11-28 Thread Peter Karman
) has no explicit Win32 support and I don't intend to write any, though I'd be happy to include it if someone contributed it. That said, I'm happy to support Marvin's desire for MSVC as a compatibility target (esp since I haven't had to write any relevant code! ;) ). -- Peter Karman . http://peknet.com

Re: [Lucy] Re: quiet gcc warnings

2009-11-28 Thread Peter Karman
Marvin Humphrey wrote on 11/27/09 12:10 PM: On Fri, Nov 27, 2009 at 09:04:27AM -0600, Peter Karman wrote: Not sure that this merits a JIRA ticket, but this little patch quiets a gcc warning: #elif (SIZEOF_PTR == 8) -size_t address = self; +size_t address = (size_t)self; Hmm
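
For readers following along, a small standalone sketch of the cast in the patch above. The function name format_address and the output format are assumptions made for illustration, not the project's code. It also assumes size_t is at least as wide as a pointer, which holds on common platforms but is only guaranteed for uintptr_t.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write an object's address into buf as 16 hex
 * digits. The explicit (size_t) cast is what keeps gcc from warning
 * about an implicit pointer-to-integer conversion. */
int
format_address(const void *self, char *buf, size_t buf_size) {
    size_t address = (size_t)self;
    /* Two 16-bit shifts instead of one 32-bit shift, so the expression
     * stays well defined even where size_t is only 32 bits wide. */
    unsigned long address_hi = (unsigned long)((address >> 16) >> 16);
    unsigned long address_lo = (unsigned long)(address & 0xFFFFFFFFUL);
    return snprintf(buf, buf_size, "0x%08lx%08lx", address_hi, address_lo);
}
```

Splitting the value into high and low halves lets the same format string serve 32-bit and 64-bit builds; on a 32-bit build the high half simply prints as zeros.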

cast from pointer to integer of different size

2009-11-27 Thread Peter Karman
, but as Obj_hash_code() is used all over the place, I'm not sure if that's the best solution. Thoughts? -- Peter Karman . http://peknet.com/ . pe...@peknet.com

quiet gcc warnings

2009-11-27 Thread Peter Karman
, Obj_Get_Class_Name(self), address_hi, -- Peter Karman . http://peknet.com/ . pe...@peknet.com

[jira] Commented: (LUCY-72) use short names wherever CHAZ_USE_SHORT_NAMES is in effect

2009-11-26 Thread Peter Karman (JIRA)
[ https://issues.apache.org/jira/browse/LUCY-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782952#action_12782952 ] Peter Karman commented on LUCY-72: -- Ah, this was my fault. I did this: % grep -r -l

[jira] Updated: (LUCY-72) use short names wherever CHAZ_USE_SHORT_NAMES is in effect

2009-11-25 Thread Peter Karman (JIRA)
[ https://issues.apache.org/jira/browse/LUCY-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Karman updated LUCY-72: - Attachment: chaz_prefix-cleanup.patch This patch removes the chaz_ prefix wherever the SHORT_NAMES macro

Re: [Lucy] Re: effort

2009-11-22 Thread Peter Karman
on this next, as long as it won't step on the other refactoring going on. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Re: [KinoSearch] Compile 0.30_07 on FreeBSD 7

2009-11-21 Thread Peter Karman
Nathan Kurz wrote on 11/21/09 12:51 PM: 3) Mention that one needs to create an account, and that the 'Create New Issue' link does not appear until you are signed in, and that no amount of searching will help you find it until you have done this. +1 for this. -- Peter Karman . http

effort

2009-11-20 Thread Peter Karman
I have some time. How can I help? -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Re: [Lucy] Re: effort

2009-11-20 Thread Peter Karman
Peter Karman wrote on 11/20/09 10:23 PM: Marvin Humphrey wrote on 11/20/09 10:19 PM: The two tasks I'd suggest are superficial style changes which will serve to familiarize you with most of Charmonizer's modules. Task #1: The core Lucy code base has adopted a naming convention taken

[jira] Created: (LUCY-70) Charmonizer static functions require S_ prefix like the rest of Lucy code

2009-11-20 Thread Peter Karman (JIRA)
: Task Components: Charmonizer Reporter: Peter Karman Priority: Minor The Charmonizer code predates the convention used elsewhere in Lucy of prefixing static functions with S_. -- This message is automatically generated by JIRA. - You can reply to this email to add
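
A short illustration of the convention, as I read it: file-local (static) functions carry the S_ prefix, externally visible ones do not. Both function names below are made up for the example and do not appear in Charmonizer.

```c
#include <stddef.h>

/* File-local helper: declared static, so it gets the S_ prefix.
 * (Hypothetical name, for illustration only.) */
static size_t
S_count_char(const char *str, char target) {
    size_t count = 0;
    for (; *str != '\0'; str++) {
        if (*str == target) { count++; }
    }
    return count;
}

/* Externally visible function: no S_ prefix.
 * (Also a hypothetical name.) */
size_t
Util_count_slashes(const char *path) {
    return S_count_char(path, '/');
}
```

The prefix makes it obvious at a call site whether a function is private to the file or part of the module's external surface.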

[jira] Updated: (LUCY-70) Charmonizer static functions require S_ prefix like the rest of Lucy code

2009-11-20 Thread Peter Karman (JIRA)
[ https://issues.apache.org/jira/browse/LUCY-70?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Karman updated LUCY-70: - Attachment: S_-prefix.patch Patch implements the S_ prefix for Charmonizer code. Charmonizer static

Re: [Lucy] Re: [Lucy] Re: [Lucy] Re: effort

2009-11-20 Thread Peter Karman
lucene.apache.org projects do: almost everything, and everything not originating with a committer that has a CLA (Contributor License Agreement) on file with Apache, goes through JIRA. No worries. When in Rome and all that. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Re: [Lucy] Re: [Lucy] Re: effort

2009-11-20 Thread Peter Karman
Marvin Humphrey wrote on 11/20/09 11:23 PM: On Fri, Nov 20, 2009 at 11:04:28PM -0600, Peter Karman wrote: Patch attached. Groovy. That was fast. :) speed possible only because there were good tests already in place. I sleep better and code faster when there is good test coverage

Re: [Lucy] Re: [KinoSearch] Compile 0.30_07 on FreeBSD 7

2009-11-15 Thread Peter Karman
, the issues do sound complex and I understand why you've done it the way you have. -- Peter Karman . http://peknet.com/ . pe...@peknet.com

Re: [Lucy] Re: [Lucy] Passing strings from Host to Lucy

2009-09-01 Thread Peter Karman
Peter Karman wrote on 09/01/2009 02:29 PM: Marvin Humphrey wrote on 09/01/2009 10:58 AM: Right now, this data structure bears the whimsical name ZombieCharBuf, as in "A CharBuf which cannot be Destroyed". ZombieCharBufs are either created on the stack or as compile-time static or global

Re: [Lucy] Planning for the future: Lucy version 2 = Lucy2?

2009-06-14 Thread Peter Karman
packages will detect the existence of previous versions and force you to step through a manual acknowledgment of yes, I know I'm likely breaking something as part of the install. -- Peter Karman . http://peknet.com/ . pe...@peknet.com gpg key: 37D2 DAA6 3A13 D415 4295 3A69 448F E556 374A 34D9

Re: Metadata encoding

2009-03-24 Thread Peter Karman
I advocate that we try to work within JSON's constraints. JSON++ If libswish3 were not already built around libxml2, I'd be using JSON for my config format instead of XML. -- Peter Karman . http://peknet.com/ . pe...@peknet.com