Re: 32-bit APIs with 64-bit data

2010-10-29 Thread Patrick McManus
On Thu, 2010-10-28 at 08:38 -0600, Leif Hedstrom wrote:

 while looking at some issues we're having with InkAPI plugins, the issue 

I saw the message yesterday on the changing of the plugin A{BP}I in a
non-compatible way in order to support 64 bit lengths. (i.e. option 2).

I hope if you do that you also change the ATS major version to 3. People
expect compatibility across minor releases, and rightfully so as that is
the industry convention.

As long as you are making breaking-changes I really hope you'll take it
further than just the 64 bit change.. here are a few suggestions based
on my experience with the current SDK:

#1 - address the utter lack of type safety. e.g.: vconnection callback
functions should not be the same as transaction callback functions.. and
things like INKIOBuffer and INKIOBufferData and INKIOBufferBlock are all
typedef'd to void *. I understand multiplexing void *edata in the
callback function (though I don't think you should do that either), but
essentially saying at the compiler level any pointer to
INKVConnClose() will be fine is just shooting yourself in the foot.
Indeed, this was the source of my post the other day dealing with
HttpTxnIntercept() failing - I had passed a pointer of the wrong type to
some function that could easily have been well-typed and things just
went off the rails from there. In various guises this has been behind
almost all my getting-started frustrations with the plugin
architecture.

#2 - A simple and common operation like get response content-type is
really a small mountain of code. Consider providing easy wrapper
functions for getting and setting of single valued headers (have them
toss errors when they aren't single valued) and response codes. They can
still be zero-copy value/length oriented and reference counted - just
make it opaque.. e.g:

INKHeader headerval;
if (INKGetResponseHeader (headerval, Content-Type) == 0) {
 LocalCode_ProcessContentType(headerval.str, headerval.len);
 INKReleaseResponseHeader(headerval);
}

#3 - INKHttpHdrDestroy(bufp, hdr) - requires the same bufp be passed as
at was used at creation time.. this seems to be an error prone
expectation of caller, just store it in hdr. I think there are other
examples like this..

#4 - most things are INKCamelCase, but some are not: INKmalloc.
Personally, this fills me with doubt every time I start to type INK..
because I cannot remember what the pattern is. There are lots of other
naming inconsistencies - for example INKMBufferCreate() creates an
INKMBuffer, INKHttpParserCreate() creates a INKHttpParser, and
INKHttpHdrCreate() creates a INKHttpHdr^H^H^H^H^H^H^H^H^H INKMLoc. Say
what? All that little stuff just makes the SDK frustrating to use and
violates expectations - and violated expectations lead to bugs
(especially when everything is a void *).

#5 - memory request pools (ala apache http server) would be an obvious
and important addition for plugins to use.

- free this string..

hope this helps.

-Patrick





ts 2.0.0 failed assert `FATAL: CacheDir.cc:693

2010-08-11 Thread Patrick McManus
Hi all -

I'm working on a trafficserver plugin - and if I spin my ~40 case test
suite through TS repeatedly I can pretty easily reproduce the assert
below. (It is not deterministic, sometimes it takes just a couple
iterations, sometimes dozens.)

Ts is 2.0.0 that I compiled myself in debug mode on 64 bit linux.

I've run it through valgrind and haven't seen my plugin doing anything
suspicious - but I cannot reproduce the assert without the plugin
installed. The plugin does change the document size and content.

Unfortuantely, I don't own the copyright on the plugin code, so I cannot
post it in whole.

Anyone have any thoughts on what I might be doing wrong, or if there is
a caching problem related to adaptations that change document sizes?

FATAL: CacheDir.cc:693: failed assert `(unsigned int) dir_approx_size(dir) = 
(unsigned int) (MAX_FRAG_SIZE + sizeofDoc)`
/home/mcmanus/ts/bin/traffic_server - STACK TRACE: 
/home/mcmanus/ts/bin/traffic_server(ink_fatal_va+0xcb)[0x7579bb]
/home/mcmanus/ts/bin/traffic_server(ink_fatal+0xc7)[0x757aa5]
/home/mcmanus/ts/bin/traffic_server(_ink_assert+0xdb)[0x756bff]
/home/mcmanus/ts/bin/traffic_server(_Z13dir_overwriteP7INK_MD5P4PartP3DirS4_b+0x18d)[0x6e8838]
/home/mcmanus/ts/bin/traffic_server(_ZN7CacheVC22openWriteCloseHeadDoneEiP5Event+0x189)[0x70a2d1]
/home/mcmanus/ts/bin/traffic_server(_ZN12Continuation11handleEventEiPv+0x6c)[0x4f1c22]
/home/mcmanus/ts/bin/traffic_server(_ZN12PartCallback12aggWriteDoneEiP5Event+0x137)[0x7055dd]
/home/mcmanus/ts/bin/traffic_server(_ZN12Continuation11handleEventEiPv+0x6c)[0x4f1c22]
/home/mcmanus/ts/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x131)[0x74e66f]
/home/mcmanus/ts/bin/traffic_server(_ZN7EThread7executeEv+0x31e)[0x74eb12]
/home/mcmanus/ts/bin/traffic_server[0x74d1c6]
/lib/libpthread.so.0(+0x69ca)[0x77bc59ca]
/lib/libc.so.6(clone+0x6d)[0x759aa6fd]

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x746a3710 (LWP 24261)]
0x758f7a75 in *__GI_raise (sig=value optimized out) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64
64  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0  0x758f7a75 in *__GI_raise (sig=value optimized out) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x758fb5c0 in *__GI_abort () at abort.c:92
#2  0x007578f0 in ink_die_die_die (retval=1) at ink_error.cc:43
#3  0x007579c8 in ink_fatal_va (return_code=1, 
message_format=0x746a29b0 CacheDir.cc:693: failed assert `(unsigned 
int) dir_approx_size(dir) = (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)`, 
ap=0x746a28b0)
at ink_error.cc:67
#4  0x00757aa5 in ink_fatal (return_code=1, 
message_format=0x746a29b0 CacheDir.cc:693: failed assert `(unsigned 
int) dir_approx_size(dir) = (unsigned int) (MAX_FRAG_SIZE + sizeofDoc)`) at 
ink_error.cc:75
#5  0x00756bff in _ink_assert (a=0x7ec9e0 (unsigned int) 
dir_approx_size(dir) = (unsigned int) (MAX_FRAG_SIZE + sizeofDoc), f=0x7ec64c 
CacheDir.cc, l=693)
at ink_assert.cc:47
#6  0x006e8838 in dir_overwrite (key=0x1c21fe0, d=0x19d04d0, 
dir=0x1c22010, overwrite=0x1c370b8, must_overwrite=false) at CacheDir.cc:693
#7  0x0070a2d1 in CacheVC::openWriteCloseHeadDone (this=0x1c21f90, 
event=3900, e=0x0) at CacheWrite.cc:1170
#8  0x004f1c22 in Continuation::handleEvent (this=0x1c21f90, 
event=3900, data=0x0) at ../iocore/eventsystem/I_Continuation.h:147
#9  0x007055dd in PartCallback::aggWriteDone (this=0x19cdda0, event=1, 
e=0x1bc95e0) at CacheWrite.cc:327
#10 0x004f1c22 in Continuation::handleEvent (this=0x19cdda0, event=1, 
data=0x1bc95e0) at ../iocore/eventsystem/I_Continuation.h:147
#11 0x0074e66f in EThread::process_event (this=0x74aa8010, 
e=0x1bc95e0, calling_code=1) at UnixEThread.cc:113
#12 0x0074eb12 in EThread::execute (this=0x74aa8010) at 
UnixEThread.cc:219
#13 0x0074d1c6 in spawn_thread_internal (a=0xc090c0) at Thread.cc:85
#14 0x77bc59ca in start_thread (arg=value optimized out) at 
pthread_create.c:300
#15 0x759aa6fd in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#16 0x in ?? ()
(gdb) 




Re: ts 2.0.0 failed assert `FATAL: CacheDir.cc:693

2010-08-11 Thread Patrick McManus
On Wed, 2010-08-11 at 13:27 -0700, John Plevyak wrote:
 I have created a bug:
 
 https://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12471346
 
 
 There is a patch attached.  See if that fixes your problem.
 
 john

John,

A million thank yous for looking at the issue. Definite progress, but it
now assert()s in a different place. Seemingly on the cache read path -


FATAL: CacheRead.cc:981: failed assert `f.single_segment`
/home/mcmanus/ts/bin/traffic_server - STACK TRACE: 
/home/mcmanus/ts/bin/traffic_server(ink_fatal_va+0xcb)[0x7579ef]
/home/mcmanus/ts/bin/traffic_server(ink_fatal+0xc7)[0x757ad9]
/home/mcmanus/ts/bin/traffic_server(_ink_assert+0xdb)[0x756c33]
/home/mcmanus/ts/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x956)[0x7031a0]
/home/mcmanus/ts/bin/traffic_server(_ZN12Continuation11handleEventEiPv+0x6c)[0x4f1c22]
/home/mcmanus/ts/bin/traffic_server(_ZN7CacheVC14handleReadDoneEiP5Event+0x92c)[0x6db656]
/home/mcmanus/ts/bin/traffic_server(_ZN12Continuation11handleEventEiPv+0x6c)[0x4f1c22]
/home/mcmanus/ts/bin/traffic_server(_ZN19AIOCallbackInternal11io_completeEiPv+0x3f)[0x6e2691]
/home/mcmanus/ts/bin/traffic_server(_ZN12Continuation11handleEventEiPv+0x6c)[0x4f1c22]
/home/mcmanus/ts/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x131)[0x74e6a3]
/home/mcmanus/ts/bin/traffic_server(_ZN7EThread7executeEv+0x99)[0x74e8c1]
/home/mcmanus/ts/bin/traffic_server[0x74d1fa]
/lib/libpthread.so.0(+0x69ca)[0x77bc59ca]
/lib/libc.so.6(clone+0x6d)[0x759aa6fd]

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x746a3710 (LWP 14525)]
0x758f7a75 in *__GI_raise (sig=value optimized out) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64
64  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0  0x758f7a75 in *__GI_raise (sig=value optimized out) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x758fb5c0 in *__GI_abort () at abort.c:92
#2  0x00757924 in ink_die_die_die (retval=1) at ink_error.cc:43
#3  0x007579fc in ink_fatal_va (return_code=1, 
message_format=0x746a2820 CacheRead.cc:981: failed assert 
`f.single_segment`, ap=0x746a2720) at ink_error.cc:67
#4  0x00757ad9 in ink_fatal (return_code=1, 
message_format=0x746a2820 CacheRead.cc:981: failed assert 
`f.single_segment`) at ink_error.cc:75
#5  0x00756c33 in _ink_assert (a=0x7f2a3e f.single_segment, 
f=0x7f226b CacheRead.cc, l=981) at ink_assert.cc:47
#6  0x007031a0 in CacheVC::openReadStartHead (this=0xf6f4a0, 
event=3900, e=0x0) at CacheRead.cc:981
#7  0x004f1c22 in Continuation::handleEvent (this=0xf6f4a0, event=3900, 
data=0x0) at ../iocore/eventsystem/I_Continuation.h:147
#8  0x006db656 in CacheVC::handleReadDone (this=0xf6f4a0, event=3900, 
e=0xf6f618) at Cache.cc:1885
#9  0x004f1c22 in Continuation::handleEvent (this=0xf6f4a0, event=3900, 
data=0xf6f618) at ../iocore/eventsystem/I_Continuation.h:147
#10 0x006e2691 in AIOCallbackInternal::io_complete (this=0xf6f618, 
event=1, data=0xf0db30) at ../../iocore/aio/P_AIO.h:82
#11 0x004f1c22 in Continuation::handleEvent (this=0xf6f618, event=1, 
data=0xf0db30) at ../iocore/eventsystem/I_Continuation.h:147
#12 0x0074e6a3 in EThread::process_event (this=0x74aa8010, 
e=0xf0db30, calling_code=1) at UnixEThread.cc:113
#13 0x0074e8c1 in EThread::execute (this=0x74aa8010) at 
UnixEThread.cc:168
#14 0x0074d1fa in spawn_thread_internal (a=0xc090c0) at Thread.cc:85
#15 0x77bc59ca in start_thread (arg=value optimized out) at 
pthread_create.c:300
#16 0x759aa6fd in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x in ?? ()
(gdb) 




Re: ts 2.0.0 failed assert `FATAL: CacheDir.cc:693

2010-08-11 Thread Patrick McManus
On Wed, 2010-08-11 at 17:51 -0400, Patrick McManus wrote:
 On Wed, 2010-08-11 at 13:27 -0700, John Plevyak wrote:
  I have created a bug:
  
  https://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12471346
  
  
  There is a patch attached.  See if that fixes your problem.
  
  john
 
 John,
 
 A million thank yous for looking at the issue. Definite progress, but it
 now assert()s in a different place. Seemingly on the cache read path -
 
 
 FATAL: CacheRead.cc:981: failed assert `f.single_segment`

On closer inspection there is a valid path for !f.single_segment - we
just need to update the patch to take that now that we have violated the
assumption.

Does this updated patch make sense to you? It passes my tests with
flying colors..

diff --git a/iocore/cache/CacheRead.cc b/iocore/cache/CacheRead.cc
index 09195d8..3a0bad7 100644
--- a/iocore/cache/CacheRead.cc
+++ b/iocore/cache/CacheRead.cc
@@ -978,10 +978,12 @@ CacheVC::openReadStartHead(int event, Event * e)
 doc_len = alternate.object_size_get();
 if (key == doc-key) {  // is this my data?
   f.single_segment = doc-single_segment();
-  ink_assert(f.single_segment); // otherwise need to read earliest
-  ink_assert(doc-hlen);
-  docpos = sizeofDoc + doc-hlen;
-  next_CacheKey(key, doc-key);
+  if (f.single_segment)
+   {
+ ink_assert(doc-hlen);
+ docpos = sizeofDoc + doc-hlen;
+ next_CacheKey(key, doc-key);
+   }
 } else {
   f.single_segment = false;
 }
diff --git a/iocore/cache/CacheWrite.cc b/iocore/cache/CacheWrite.cc
index a1f529a..8e59db1 100644
--- a/iocore/cache/CacheWrite.cc
+++ b/iocore/cache/CacheWrite.cc
@@ -191,10 +191,14 @@ CacheVC::handleWrite(int event, Event * e)
 vec_len = 0;
   set_agg_write_in_progress();
   POP_HANDLER;
-  agg_len = round_to_approx_size(write_len + vec_len + sizeofDoc);
+  int to_write = write_len + vec_len + sizeofDoc;
+  if (to_write  MAX_FRAG_SIZE) {
+write_len = MAX_FRAG_SIZE - vec_len - sizeofDoc;
+to_write = MAX_FRAG_SIZE;
+  }
+  agg_len = round_to_approx_size(to_write);
   part-agg_todo_size += agg_len;
-  ink_assert(agg_len = AGG_SIZE);
-  bool agg_error = (agg_len  AGG_SIZE ||
+  bool agg_error = (agg_len  AGG_SIZE || write_len  0 ||
 (!f.readers  (part-agg_todo_size  
cache_config_agg_write_backlog + AGG_SIZE)  write_len));
 #ifdef CACHE_AGG_FAIL_RATE
   agg_error = agg_error || ((inku32) mutex-thread_holding-generator.random() 

@@ -211,6 +215,7 @@ CacheVC::handleWrite(int event, Event * e)
 io.aio_result = AIO_SOFT_FAILURE;
 return handleEvent(AIO_EVENT_DONE, 0);
   }
+  ink_assert(agg_len = AGG_SIZE);
   if (f.evac_vector)
 part-agg.push(this);
   else





Re: Deprecated InkAPI / ts.h functions?

2010-06-24 Thread Patrick McManus
Hi All,

On Thu, 2010-06-17 at 20:26 -0600, Leif Hedstrom wrote:

 Anyone else have any thoughts / comments on this topic ? Unless it gets 
 voted down, I'll file a bug and list the APIs that we'd remove from 
 the 2.2 release. Yes, it's semi painful, but I think in the long term, 
 it helps developers not use APIs that we no longer support (and which 
 have better versions anyways).

First, I'm sorry I am a week late to this discussion. I know that's
annoying.

I think the proposed change almost always has negative consequences.
This is a new project and in just its second release this change breaks
compatibility of existing plugins. Over a change in minor version
numbers no less!

TS will get, fairly, a reputation for not having any kind of stable API
or ABI and being a platform with a high maintenance burden to operate. I
know the intention is to be developer friendly, but it really is the
opposite of that for both plugin developer and administrator (who will
find they cannot update their TS because their plugins require porting
to 2.2 with no notable improvement in plugin functionality.).

The 2.0 examples directory contains multiple uses of functions marked
deprecated in the documentation. INKMimeHdrFieldValueInsert() is one
such example. If the illustration code in the 2.0 release cannot abide
by the deprecated warning it makes sense that developers reading that
code would feel free to ignore it as well. If this change is made,
example plugin code that is shipped as part of 2.0 will not run or
compile on 2.2. That breaks reasonable user expectations.

If there were a solid technical reason for removing the functions (i.e.
they cannot be fixed to implement what they are defined to do, or their
presence prevents the implementation of some other function or
improvement) I would feel differently. But I don't see one here.

I certainly have plenty of interfaces of my own that I have designed and
then later wished never saw the light of day. But if they are public
interfaces, then my experience really says it is best to just suck it up
and live with them in perpetuity even if deleting them would be more
convenient for me.

In my opinion, there are several other better ways of improving plugin
design decisions. First, clean up the example code to remove deprecated
code. Second, clean up the documentation so the deprecated functions are
not mixed in with the stuff you want people to use  - exile them to a
deprecated appendix. Third, use compiler attributes to nag people who do
use them anyhow.. Finally, in a major release (i.e. 3.0) perhaps remove
the SDK prototypes for the deprecated functions but preserve the code in
order to maintain a stable ABI for old plugins.

-Patrick




Re: Deprecated InkAPI / ts.h functions?

2010-06-24 Thread Patrick McManus
On Thu, 2010-06-24 at 08:09 -0600, Leif Hedstrom wrote:

 
 How about we do this for 2.2 (this is basically a summary of your ideas):
 

I'm in favor of that plan, thanks for responding.

One more thing to think about when the time comes to break compatibility
in a new major release. Doing so will give people a reason not to
upgrade, which is definitely not what you want. :) That kind of inertia
can be overwhelming - how long did it take for apache server 2.x to
finally supplant 1.x in installations? 1.x had to be maintained with
fixes for years because 2.x wasn't a viable upgrade path for lots of
users.

If it hasn't been a pain in practice to keep the interfaces alive during
2.x then you might just be inventing a headache instead of removing
one. 

In any event, the relative merits of all that can be better balanced
when 3.0 comes along in the future.

-Patrick




[jira] Created: (TS-327) traffic_manager silently fails to start if proxy.config.cluster.ethernet_interface is incorrect

2010-04-28 Thread patrick mcmanus (JIRA)
traffic_manager silently fails to start if   
proxy.config.cluster.ethernet_interface is incorrect
-

 Key: TS-327
 URL: https://issues.apache.org/jira/browse/TS-327
 Project: Traffic Server
  Issue Type: Bug
  Components: Config
Affects Versions: 2.0.0a
 Environment: ubuntu 10.04 (rc) 64 bit
Reporter: patrick mcmanus
Priority: Minor


First ever build from source, 
http://incubator.apache.org/trafficserver/docs/v2/admin/getstart.htm advises me 
to do
./trafficserver start
to start things up.. it failed silently.

Nothing interesting in any of the logs. (nothing at all really, other than 
messages about opening logs).

Turns out only traffic_cop was running, not traffic_manager or traffic_server

Debugger on traffic manager had it exiting Unable to find network interface 
eth0... which is true enough - I use br0 for most things.

Traffic manager noted the problem at level note and then did a full exit.

A full exit should be at least level Error so I could correct the problem.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.