Excellent, Ryan! Thanks for digging into this.

It looks like a change I made recently, tested in the usual way but only on
simpler Linux environments (incl. Travis), has caused this [1]. The change
(to stop using "bash -c" when making the subprocess call that launches
workers) may have been working fine for you, but it now fails to pick up the
right Python in a shell for your environment. An alternative method exists
for that change, so we'll need to work out whether it is good to go. I'll
discuss this later today with Scott.
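
For reference, a rough sketch of the kind of alternative I have in mind (the
function names and arguments below are hypothetical, not the actual
permutations_runner code): launching the worker via sys.executable avoids
depending on whatever "python" happens to resolve to in a shell.

```python
import subprocess
import sys

def workerCommand(workerScript, extraArgs=()):
    # Build the argv list for the worker subprocess. Using sys.executable
    # pins the worker to the same interpreter that is running the swarm,
    # instead of whatever "python" resolves to on the PATH. (Aliases set
    # in ~/.bashrc are not inherited by subprocess calls.)
    return [sys.executable, workerScript] + list(extraArgs)

def launchWorker(workerScript, extraArgs=()):
    # Passing a list (not a string) means no shell is involved, so there
    # is no "bash -c" and no PATH/alias surprises.
    return subprocess.Popen(workerCommand(workerScript, extraArgs))
```

This is just a sketch of the approach, not a drop-in patch.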

Best regards, Richard.

[1]
https://github.com/rcrowder/nupic/commit/1ee717ee6ed27c65d21a6312089170f170e960d8

On Mon, Nov 2, 2015 at 1:56 AM, Ryan J. McCall <[email protected]>
wrote:

> Aha, I found the issue. The child process (running HypersearchWorker.py)
> was picking up python2.6, which is installed on the machine. There is a
> hard-coded command-line statement containing "python" in the
> permutations_runner.py code, and when I switched it to "python2.7" it works.
> Here's the line I changed in the current code:
>
>
> https://github.com/numenta/nupic/blob/master/src/nupic/swarming/permutations_runner.py#L676
>
> Is there a standard way of telling a Linux machine which Python to use?
> I suppose that would be the best solution. I had made an alias in my bashrc
> to set "python" to version 2.7, but clearly that must not apply to
> subprocesses. If you can't specify this, then it seems we want the "python"
> to be configurable, or detectable from the system.
>
> On Sun, Nov 1, 2015 at 2:30 PM, Richard Crowder <[email protected]> wrote:
>
>> "linux2" looks fine for the handlers, where they use startswith("linux"),
>> so it's not likely to be that. The only other thing I needed to do was
>> delete the generated swarming files.
>> So I'm out of ideas as to why I could get it to work on Windows and you
>> couldn't :(
>> Unless it's something with different bindings versions or some other
>> Python package. Locally I have nupic 0.3.6.dev0 and nupic.bindings 0.2.2
>> and a variety of other Python packages.
>>
>> Does "import os; print os.pathsep" print a colon? I imagine it does...
>> I'll try an Ubuntu VM though.
>>
>>
>> On Sun, Nov 1, 2015 at 10:08 PM, Ryan J. McCall <[email protected]>
>> wrote:
>>
>>> Hi Richard,
>>>
>>> Thanks for the reply. I'm not sure what I might change regarding the log
>>> handlers. (I see that there is a default logging conf file that I can
>>> override in my NTA_CONF_PATH.) In my script I'm able to say:
>>>
>>> from nupic.support import initLogging
>>> initLogging()
>>>
>>> and I see a difference in the messages logged to console.
>>>
>>> The swarm-generated files don't seem to be the problem.
>>>
>>> "import sys; print sys.platform.lower()" gives "linux2"
>>>
>>> Best,
>>>
>>> Ryan
>>>
>>> On Sun, Nov 1, 2015 at 3:19 AM, Richard Crowder <[email protected]>
>>> wrote:
>>>
>>>> Hi Ryan,
>>>>
>>>> I've just updated my nupic.core and nupic forks with the latest from
>>>> Numenta master, and faced the exact same problem (but on Windows). I
>>>> needed to do two things: update the sys and file log handlers to support
>>>> win32 (src\nupic\support\__init__.py), and delete the files generated
>>>> during the run of the 'simple' swarming test (with one worker, i.e. no
>>>> --maxWorkers on the command line). Those changes MAY only be related to
>>>> the Windows porting, but here are a few things to try:
>>>>
>>>> - See what the Python command "import sys; print sys.platform.lower()"
>>>>   outputs.
>>>> - Clean up the files generated by the swarming (for me those files were
>>>>   description.py, permutations.py, the model_0/ directory, and a .pkl and
>>>>   a .csv file).
>>>> - Use the --overwrite flag when swarming with scripts\run_scripts.py.
>>>>
>>>> I'd be interested to see the sys.platform output.
>>>>
>>>> Regards, Richard.
>>>>
>>>>
>>>> On Sun, Nov 1, 2015 at 1:02 AM, Ryan J. McCall <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello NuPIC,
>>>>>
>>>>> I'm having an issue with swarming on a RHEL box. I've installed NuPIC
>>>>> Version: 0.3.1. I have mysql running and have confirmed that db
>>>>> connections can be made with the test_db.py script. The error I'm getting
>>>>> is similar to some other threads (traceback below). The hypersearch
>>>>> finishes quickly, evaluates 0 models, and throws an exception because
>>>>> there's no result to load. I would appreciate any suggestions. Based on
>>>>> my debugging, it looks like jobs are added to the DB. My thought is to
>>>>> debug the HypersearchWorkers next, which run as separate processes --
>>>>> I'll have to figure out how to do that...
>>>>>
>>>>> Many thanks,
>>>>>
>>>>> Ryan
>>>>>
>>>>>
>>>>> Successfully submitted new HyperSearch job, jobID=1020
>>>>> Evaluated 0 models
>>>>> HyperSearch finished!
>>>>> Worker completion message: None
>>>>>
>>>>> Results from all experiments:
>>>>> ----------------------------------------------------------------
>>>>> Generating experiment files in directory: /tmp/tmp0y39RS...
>>>>> Writing  313 lines...
>>>>> Writing  114 lines...
>>>>> done.
>>>>> None
>>>>> json.loads(jobInfo.results) raised an exception.  Here is some info to
>>>>> help with debugging:
>>>>> jobInfo:  _jobInfoNamedTuple(jobId=1020, client=u'GRP',
>>>>> clientInfo=u'', clientKey=u'', cmdLine=u'$HYPERSEARCH',
>>>>> params=u'{"hsVersion": "v2", "maxModels": null, "persistentJobGUID":
>>>>> "1a3c7950-8032-11e5-8a23-a0d3c1f9d4f4", "useTerminators": false,
>>>>> "description": {"includedFields": [{"fieldName": "time", "fieldType":
>>>>> "datetime"}, {"maxValue": 50000, "fieldName": "volume", "fieldType": 
>>>>> "int",
>>>>> "minValue": 0}], "streamDef": {"info": "rp3_volume", "version": 1,
>>>>> "streams": [{"info": "rp3_volume", "source":
>>>>> "file:///home/rmccall/experiment/projects/rp3/rp3-training_data.csv",
>>>>> "columns": ["*"]}]}, "inferenceType": "TemporalAnomaly", "inferenceArgs":
>>>>> {"predictionSteps": [1], "predictedField": "volume"}, "iterationCount": 
>>>>> -1,
>>>>> "swarmSize": "small"}}',
>>>>> jobHash='\x1a<\x81R\x802\x11\xe5\x8a#\xa0\xd3\xc1\xf9\xd4\xf4',
>>>>> status=u'notStarted', completionReason=None, completionMsg=None,
>>>>> workerCompletionReason=u'success', workerCompletionMsg=None, cancel=0,
>>>>> startTime=None, endTime=None, results=None, engJobType=u'hypersearch',
>>>>> minimumWorkers=1, maximumWorkers=8, priority=0, engAllocateNewWorkers=1,
>>>>> engUntendedDeadWorkers=0, numFailedWorkers=0,
>>>>> lastFailedWorkerErrorMsg=None, engCleaningStatus=u'notdone',
>>>>> genBaseDescription=None, genPermutations=None,
>>>>> engLastUpdateTime=datetime.datetime(2015, 11, 1, 0, 47, 18),
>>>>> engCjmConnId=None, engWorkerState=None, engStatus=None,
>>>>> engModelMilestones=None)
>>>>> jobInfo.results:  None
>>>>> EXCEPTION:  expected string or buffer
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/local/lib/python2.7/pdb.py", line 1314, in main
>>>>>     pdb._runscript(mainpyfile)
>>>>>   File "/usr/local/lib/python2.7/pdb.py", line 1233, in _runscript
>>>>>     self.run(statement)
>>>>>   File "/usr/local/lib/python2.7/bdb.py", line 400, in run
>>>>>     exec cmd in globals, locals
>>>>>   File "<string>", line 1, in <module>
>>>>>   File "htmAnomalyDetection.py", line 2, in <module>
>>>>>     import argparse
>>>>>   File "htmAnomalyDetection.py", line 314, in main
>>>>>     runSwarming(args.nupicDataPath, args.projectName, args.maxWorkers,
>>>>> args.overwrite)
>>>>>   File "htmAnomalyDetection.py", line 164, in runSwarming
>>>>>     "overwrite": overwrite})
>>>>>   File
>>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py",
>>>>> line 277, in runWithConfig
>>>>>     return _runAction(runOptions)
>>>>>   File
>>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py",
>>>>> line 218, in _runAction
>>>>>     returnValue = _runHyperSearch(runOptions)
>>>>>   File
>>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py",
>>>>> line 161, in _runHyperSearch
>>>>>     metricsKeys=search.getDiscoveredMetricsKeys())
>>>>>   File
>>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py",
>>>>> line 826, in generateReport
>>>>>     results = json.loads(jobInfo.results)
>>>>>   File
>>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/object_json.py",
>>>>> line 163, in loads
>>>>>     json.loads(s, object_hook=objectDecoderHook, **kwargs))
>>>>>   File "/usr/local/lib/python2.7/json/__init__.py", line 351, in loads
>>>>>     return cls(encoding=encoding, **kw).decode(s)
>>>>>   File "/usr/local/lib/python2.7/json/decoder.py", line 366, in decode
>>>>>     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>>>>> TypeError: expected string or buffer
>>>>>
>>>>> --
>>>>> Ryan J. McCall
>>>>> ryanjmccall.com
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Ryan J. McCall
>>> ryanjmccall.com
>>>
>>
>>
>
>
> --
> Ryan J. McCall
> ryanjmccall.com
>
