Again, thank you for the digging.  I'd be interested in why ksh's getconf() 
fails as well.
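
For what it's worth, here is a minimal sketch of how one might sanity-check
the value outside of ksh. It assumes the number ultimately comes from
something like POSIX sysconf(_SC_CHILD_MAX), which I have not confirmed in
the ast sources. The helper names are made up for illustration, and the 128
fallback is simply borrowed from the workaround quoted below.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from ksh): choose a usable child limit.
 * A zero or negative limit would leave no room for any job slots, so
 * fall back to the caller's default in that case.
 */
long sane_child_max(long reported, long fallback)
{
    return (reported > 0) ? reported : fallback;
}

/*
 * Hypothetical helper: ask the platform for CHILD_MAX via sysconf().
 * ASSUMPTION: ksh's getconf("CHILD_MAX") ends up with an equivalent value.
 */
long query_child_max(long fallback)
{
    assert(fallback > 0);

    errno = 0;
    long reported = sysconf(_SC_CHILD_MAX);

    /*
     * sysconf() returns -1 either on error (errno set) or when the limit
     * is indeterminate (errno left at 0); both cases use the fallback.
     */
    return sane_child_max(reported, fallback);
}
```

Calling query_child_max(128) from a small test program inside the LX zone and
again on SmartOS, and printing both results, should show whether the two
platforms report different limits.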

Thanks,
Dan

Sent from my iPhone (typos, autocorrect, and all)

> On May 10, 2017, at 3:44 PM, Ludovic Orban <lor...@bitronix.be> wrote:
> 
> Okay, I found what causes ksh to misbehave. It's in sh_init(), when 
> shgd->lim.child_max is initialized with the results of getconf("CHILD_MAX"), 
> see: https://github.com/att/ast/blob/master/src/cmd/ksh93/sh/init.c#L1289
> 
> I've commented out that line, hardcoded shgd->lim.child_max to 128, rebuilt 
> and voila: ksh works as it should.
> 
> Now I have to dig into that getconf() function to figure out what the returned 
> value is and where it's coming from. Sounds trivial, but my C is *very* 
> rusty, the asm gcc generates looks nothing like what the JVM's JIT generates 
> (which gives me wrong reflexes as I'm used to the latter) and I'm not very 
> familiar with mdb.
> 
> Oh well, that turned into a nice debugging re-training session which I very 
> much needed. That reminds me of the good old days at my first job when I was 
> porting Linux apps to Solaris.
> 
> Thank you for maintaining such a well-designed and pleasant to use OS!
> 
> 
>> On Wed, May 10, 2017 at 3:59 PM, Dan McDonald <dan...@omniti.com> wrote:
>> Wow, thank you for the further deep-diving.
>> 
>> > On May 10, 2017, at 5:21 AM, Ludovic Orban <lor...@bitronix.be> wrote:
>> >
>> > Looking at ksh's sources, my understanding is that job_post is stuck in 
>> > that else clause:
>> >        else
>> >        {
>> >               /* create a new job */
>> >               while((pw->p_job = job_alloc()) < 0)
>> >                      job_wait((pid_t)1);
>> >               pw->p_nxtjob = job.pwlist;
>> >               pw->p_nxtproc = 0;
>> >        }
>> >
>> > Digging into the sources and stepping through the instructions of job_alloc 
>> > and job_byjid, it looks like ksh cannot allocate a job id as it believes 
>> > they're all reserved. But so far, all this code is purely working on 
>> > internal structures of ksh, so an LX bug would have no impact.
>> >
>> > I'll continue looking into this as time permits and I'll post an update if 
>> > I find anything worth mentioning.
>> >
>> 
>> Be careful of narrowing your focus too far.  I see some things worth 
>> considering:
>> 
>> 1.) Is the "if" you're not showing me dependent on something in global state 
>> that may have been mis-initialized by an LX emulation bug?
>> 
>> 2.) Same question as #1, but applied to job_alloc() and job_wait().
>> 
>> I'm guessing LX in OmniOS is failing because I mismerged or plain forgot 
>> something, given that Nahum says he can run ksh93 on SmartOS just fine.
>> 
>> 
>> Please make sure you're looking at the bigger picture, but THANK YOU for the 
>> further investigation.
>> 
>> Dan
>> 
> 
_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
