think we
should still do just that: shunt all attention and communication to Lucy, by
listing the Lucy website and mailing lists in the KinoSearch documentation
under SUPPORT.
+1
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 7/25/10 3:07 PM:
peter AT peknet DOT com
yes, please.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
...@incubator [lucy-private]
+1
Volunteers? It hasn't been a heavy burden.
I'm happy to.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 7/23/10 3:27 PM:
On Fri, Jul 23, 2010 at 11:00:58AM -0500, Peter Karman wrote:
those all sound good for Lucy. Should not impede the KS3 release though. I
imagine Lucy1 as an improvement on KS3, inspiring users to migrate.
Forking and releasing KS3 is not a huge
with the proposal as it is
written at the moment; it looks like you've already addressed most of the points
above in your edits from last night.
cheers on a hard week's work!
pek
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 07/01/2010 12:33 AM:
Greets,
How does the letter below look?
+1
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
which increases quality of contribution. I'm not sure where to
put those items; community is one possibility.
Looks like you have this under Alignment now?
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
end up being illuminating to
non-users of the language.
Anyway, is it a big deal to add more lists later, based on demand, rather than
splitting them up in the initial proposal?
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
#Sponsor
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 6/26/10 9:45 PM:
On Sat, Jun 26, 2010 at 04:03:25PM -0500, Peter Karman wrote:
One thing that should attract more people is if we can get more than just Perl
bindings implemented using Lucy. I'm thinking C, PHP, Ruby, Python, etc. To my
knowledge, none of those
[
https://issues.apache.org/jira/browse/LUCY-114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Karman updated LUCY-114:
--
Attachment: align_signature.patch
The inline patch came out garbled. Same patch attached.
compile
Reporter: Peter Karman
In Clownfish files (.bp) such as KinoSearch/Search/Compiler.bp, certain methods
are defined
as nullable but that nullable attribute is not being propagated to the
_OVERRIDE generated
code.
--
This message is automatically generated by JIRA.
-
You can reply to this email
: Peter Karman
The Build.PL docs claim that --config cc= should work but it does not.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[
https://issues.apache.org/jira/browse/LUCY-116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Karman updated LUCY-116:
--
Attachment: get_cc.patch
pass_cc.patch
The attached patches implement the documented
offset was ignored.
* r6079-6081: Fix problem with corruption of the Perl stack by
Host_callback().
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 04/06/2010 01:20 PM:
On Tue, Apr 06, 2010 at 11:56:56AM -0500, Peter Karman wrote:
Is there any legitimate way that $hitdoc->get_score would return a value
that Perl stringifies to 'nan'?
When using a SortSpec that doesn't include scoring, yes. In that case
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
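The reply above confirms that a score can legitimately come back as NaN when the SortSpec does not include scoring. A minimal sketch in C of how a caller might guard against that before displaying a score; `format_score` is a hypothetical helper, not part of KinoSearch, and the "n/a" placeholder is an assumption:

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>

/* Hypothetical helper: write a printable form of `score` into `buf`.
 * A NaN score is rendered as "n/a" instead of the platform's "nan" text.
 * Returns the number of characters snprintf would have written. */
int format_score(char *buf, size_t size, double score) {
    assert(buf != NULL && size > 0);
    if (isnan(score)) {                 /* isnan() is a C99 macro from math.h */
        return snprintf(buf, size, "n/a");
    }
    return snprintf(buf, size, "%.4f", score);
}
```

Checking with isnan() rather than comparing the stringified value avoids depending on how a given platform spells NaN.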
Peter Karman wrote on 4/2/10 9:36 PM:
Marvin Humphrey wrote on 3/30/10 2:58 PM:
Wish I had access to a FreeBSD 8.x box so I could figure out what went wrong
here:
http://www.cpantesters.org/cpan/report/7013414
Still broken, same problem:
http://www.cpantesters.org/cpan/report
this way. Sysadmin-y stuff is
a PITA.
I can help in that way though. I'll try and set up some VMs for FreeBSD 8.0 and
OpenSolaris to make it easier to test things.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 3/26/10 4:04 PM:
After the change, it would look like this:
{
entries : [
schema_3.json,
],
segments : [
seg_3
],
format : 1
}
sounds good.
--
Peter Karman . http://peknet.com/ . pe
that.
Does the logging really help the community? Or is it simply handy for your own
memory-jogging? I'm not faulting the latter, just wondering.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
, but within is a little easier to spell and just has a slightly
more natural language linguistic emphasis as opposed to more traditional
noun = value naming style.
I like 'within' -- easy (enough) to remember and type.
changes pushed in r5942.
thanks for the thorough review, Marvin.
--
Peter
Peter Karman wrote on 3/19/10 10:08 PM:
I'm going to dive into the Proximity classes now and see if I can break them.
r5936 implements ProximityQuery for KS.
Marvin, please have a look when you have a chance, and let me know what needs
changing.
In the end it was a one-line difference
Marvin Humphrey wrote on 3/21/10 3:07 PM:
On Sun, Mar 21, 2010 at 02:01:41AM -0500, Peter Karman wrote:
Marvin, please have a look when you have a chance, and let me know what needs
changing.
The current implementation has a limitation I think is probably pretty
important: 'b NEAR
. Wish I had it months ago. :) Now I understand the
function-vs-method capitalization.
thanks, Marvin.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
) and
were what sparked my initial question to the list.
I'm going to dive into the Proximity classes now and see if I can break them.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
the Wildcard
features.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
look at all
those terms in a vector space, most of them will be clustered together, but
one will be way far away.
wouldn't POS tagging achieve the same ends? Or even a look-up lexicon of nouns?
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
reduction.
I've been looking at the SenseClusters package. It's very unfriendly to use, but
the ideas in it are worth some investigation.
http://www.d.umn.edu/~tpederse/senseclusters.html
It uses SVDPACKC to do the big matrix math:
http://netlib.org/svdpack/
--
Peter Karman . http://peknet.com
feature,
specifically the PhraseScorer and the internal winnow_anchors() function. Am I
on the right track here?
[0] I believe Lucene syntax for that query is "foo bar"~10
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
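For readers unfamiliar with proximity matching, here is a minimal sketch in C of the core test: given the sorted positions of two terms in one document, decide whether any pair lies within a given distance. This is not the PhraseScorer or winnow_anchors() code discussed in the thread; `positions_within` and its parameters are hypothetical, and term order is deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if some position in a[] and some position
 * in b[] are at most `within` apart, else 0.
 * Both arrays must be sorted in ascending order. */
int positions_within(const int *a, size_t na,
                     const int *b, size_t nb, int within) {
    size_t i = 0, j = 0;
    assert(within >= 0);
    while (i < na && j < nb) {
        int diff = a[i] - b[j];
        if (diff < 0) diff = -diff;
        if (diff <= within) return 1;
        /* Advance the side that is behind: its current position can only
         * be farther from every later position on the other side. */
        if (a[i] < b[j]) i++; else j++;
    }
    return 0;
}
```

With positions {3, 40} for one term and {20, 48} for the other, the closest pair is 40 and 48, so a distance of 10 matches and a distance of 5 does not.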
manage)
versioning.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 3/14/10 7:21 PM:
On Sun, Mar 14, 2010 at 02:30:48PM -0500, Peter Karman wrote:
I like this approach. It's more work for KS developers, in terms of managing
api versions, but that's good work to do, since it requires making the api
changes very explicit.
I understand
Marvin Humphrey wrote on 3/3/10 10:16 AM:
On Tue, Mar 02, 2010 at 11:16:49PM -0600, Peter Karman wrote:
fyi, I've just committed a failing test to KS trunk as r5886. The issue
manifests when an abstract nullable method is overridden in a subclass. The
'nullable' flag is not being parsed
in the grammar passed to Parse::RecDescent in
Clownfish::Parser but that RecDescent stuff is some serious voodoo and I don't
even know where to start.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
a flush threshold, that is often mentioned as a way to control indexing speed
vs. memory use. From what I've seen of KS the indexing speed is much faster and
mem use much lower anyway, so I'm not worrying about it.
Thanks for the detailed reply re: the issues involved.
--
Peter Karman . http
not realized that. This softens my position considerably. I'm
all for increasing legacy performance so long as it doesn't
complicate the mainline architecture.
++
This is an interesting thread, though I have nothing interesting to add. :)
--
Peter Karman . http://peknet.com
Peter Karman wrote on 1/27/10 10:43 PM:
The OSX behaviour was weird. First time it segfaulted. Ran it again
under gdb and it completed ok. Ran it again without gdb and I got this:
ignore these complaints. seems my os and/or fs was/is seriously fscked.
--
Peter Karman . http://peknet.com
comparing apples with apples. I ran 0.30072 with
fewer sortable fields, since it doesn't support sortable FullTextType, and I ran
svn trunk with all FullTextType fields sortable. So the memory use could be due
to that.
I will rerun more comparable config and report back.
--
Peter Karman
!~ /-march=\w+/ ) {
+$extra_ccflags .= "-march=i486 ";
+}
that flag, while necessary on 32bit archs that I've tried, breaks compilation
immediately under 64bit archs with the message:
cc1: CPU you selected does not support x86-64 instruction set
--
Peter Karman . http
to be expected?
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 01/21/2010 11:01 PM:
So that's one item checked off the TODO list.
\o/
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Nathan Kurz wrote on 1/20/10 1:35 AM:
Could you attach your failing standalone test case so I can take a
quick look at it? I tried the inline one above, but saw nothing
strange with GCC 4.2.4 on Slamd64.
http://rectangular.com/pipermail/kinosearch/2010-January/007228.html
--
Peter Karman
://peknet.com/~karpet/char-with-O2.txt
http://peknet.com/~karpet/char-without-O2.txt
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
of.
I'm currently downloading gcc 3.4.6 (the last of the 3.4.x releases
fwiw, from 2006), and will try duplicating the problem on a different
platform with the same compiler version.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 01/19/2010 01:51 PM:
On Tue, Jan 19, 2010 at 01:37:12PM -0600, Peter Karman wrote:
ok - 0 == 0
ok - 1 == 1
ok - 100 == 100
ok - 126 == 126
ok - 127 == 127
ok - 128 == 128
ok - 129 == 129
ok - 250 == 250
ok - 254 == 254
ok - 255 == 255
Whew. If that's
for
awhile and hope that the answer just comes to me in my sleep.
Can we tag out? If you can get me an account on one of these boxen, or if we
can duplicate this behavior on an Amazon EC2 instance, I'd like to take a
crack at it.
I'll work on that and mail you offlist.
--
Peter Karman
but no memory problems %d", i);
+ASSERT_INT_EQ(batch, StrHelp_UTF8_TRAILING[i], 7,
+"UTF8_TRAILING bogus but no memory problems %d", i);
}
}
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
KinoSearch/Util/ToolSet.h
#include <ctype.h>
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
. Which makes me
wonder: what standard are we aiming for, and do we need to be telling the
compiler that?
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--disable-checking --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-java-awt=gtk --host=x86_64-redhat-linux
Thread model: posix
gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)
--
Peter Karman
Marvin Humphrey wrote on 1/14/10 11:42 AM:
On Thu, Jan 14, 2010 at 10:39:20AM -0600, Peter Karman wrote:
A web search reveals that this is a GCC error, but it's unexpected because
GCC should be tolerant by default. Do the flags that you're passing to GCC
include -std=c90, -std=C89, or -ansi
/DirManip.c:101: error: ‘DT_DIR’ undeclared (first
use in this function)
I tried it with KINO_DEBUG=1 as well under CentOS5 (gcc 4.1.2) which includes
-std=c89 and it failed with same error.
So OS X is happy only with c99 and Linux is happy only without it.
thoughts?
--
Peter Karman . http
Components: Charmonizer
Reporter: Peter Karman
Charmonizer currently uses non-standard type names for specific integer sizes.
It would be more portable and easier for first-timers to grok if we used the
standard stdint.h types instead, generating a usable stdint.h on systems where
Peter Karman wrote on 12/10/09 9:04 AM:
Marvin Humphrey wrote on 12/9/09 6:15 PM:
On Tue, Dec 08, 2009 at 09:24:44PM -0800, Marvin Humphrey wrote:
So, instead of u32_t or chy_u32_t, we'd use uint32_t everywhere.
We could actually generate a complete and usable stdint.h on systems where
Marvin Humphrey wrote on 12/11/09 9:16 PM:
On Fri, Dec 11, 2009 at 08:59:32PM -0600, Peter Karman wrote:
well, I've opened the ticket and started poking around but I am not sure
where to include stdint.h and/or test for its existence.
Is Charmonizer/Probe/Integers.c where I want to make
[
https://issues.apache.org/jira/browse/LUCY-88?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Karman updated LUCY-88:
-
Attachment: stdint_h.patch
this patch against r889879 adds stdint.h support
change Charmonizer to use
define a 'boolean' (spelled out long to avoid conflict) as a
char. I expect others out there have done similar.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
) has no
explicit Win32 support and I don't intend to write any, though I'd be happy to
include it if someone contributed it. That said, I'm happy to support Marvin's
desire for MSVC as a compatibility target (esp since I haven't had to write any
relevant code! ;) ).
--
Peter Karman . http://peknet.com
Marvin Humphrey wrote on 11/27/09 12:10 PM:
On Fri, Nov 27, 2009 at 09:04:27AM -0600, Peter Karman wrote:
Not sure that this merits a JIRA ticket, but this little patch quiets a gcc
warning:
#elif (SIZEOF_PTR == 8)
-size_t address = self;
+size_t address = (size_t)self;
Hmm
, but as
Obj_hash_code() is used all over the place, I'm not sure if that's the best
solution.
Thoughts?
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
, Obj_Get_Class_Name(self), address_hi,
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
[
https://issues.apache.org/jira/browse/LUCY-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782952#action_12782952
]
Peter Karman commented on LUCY-72:
--
Ah, this was my fault. I did this:
% grep -r -l
[
https://issues.apache.org/jira/browse/LUCY-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Karman updated LUCY-72:
-
Attachment: chaz_prefix-cleanup.patch
This patch removes the chaz_ prefix wherever the SHORT_NAMES macro
on this next, as long as it won't step on the other
refactoring going on.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Nathan Kurz wrote on 11/21/09 12:51 PM:
3) Mention that one needs to create an account, and that the 'Create
New Issue' link does not appear until you are signed in, and that no
amount of searching will help you find it until you have done this.
+1 for this.
--
Peter Karman . http
I have some time. How can I help?
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Peter Karman wrote on 11/20/09 10:23 PM:
Marvin Humphrey wrote on 11/20/09 10:19 PM:
The two tasks I'd suggest are superficial style changes which will serve to
familiarize you with most of Charmonizer's modules.
Task #1:
The core Lucy code base has adopted a naming convention taken
: Task
Components: Charmonizer
Reporter: Peter Karman
Priority: Minor
The Charmonizer code predates the convention used elsewhere in Lucy of
prefixing static functions with S_.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add
[
https://issues.apache.org/jira/browse/LUCY-70?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Karman updated LUCY-70:
-
Attachment: S_-prefix.patch
Patch implements the S_ prefix for Charmonizer code.
Charmonizer static
lucene.apache.org projects do: almost
everything, and everything not originating with a committer that has a CLA
(Contributor License Agreement) on file with Apache, goes through JIRA.
No worries. When in Rome and all that.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Marvin Humphrey wrote on 11/20/09 11:23 PM:
On Fri, Nov 20, 2009 at 11:04:28PM -0600, Peter Karman wrote:
Patch attached.
Groovy. That was fast. :)
speed possible only because there were good tests already in place. I sleep
better and code faster when there is good test coverage
, the issues do sound complex and I understand why you've done it the way you
have.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
Peter Karman wrote on 09/01/2009 02:29 PM:
Marvin Humphrey wrote on 09/01/2009 10:58 AM:
Right now, this data structure bears the whimsical name ZombieCharBuf, as in
A CharBuf which cannot be Destroyed. ZombieCharBufs are either created on
the stack or as compile-time static or global
packages will detect the existence of
previous versions and force you to step through a manual acknowledgment of "yes,
I know I'm likely breaking something" as part of the install.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
gpg key: 37D2 DAA6 3A13 D415 4295 3A69 448F E556 374A 34D9
I advocate that we try to work within JSON's constraints.
JSON++
If libswish3 were not already built around libxml2, I'd be using JSON for my
config format instead of XML.
--
Peter Karman . http://peknet.com/ . pe...@peknet.com
77 matches