Location: Worcester, MA
Duration: 6+ months
Rate: Open

Duties and Responsibilities:
- Maintain IBM Job Scheduler and storage allocation systems and policies to ensure fair allocation of shared resources;
- Install, configure, and maintain HPC-based Linux systems;
- Install and configure custom and GNU-based software on the cluster;
- Implement and support performance monitoring and fault monitoring systems;
- Propose and create system design models, specifications, diagrams, and charts to provide direction;
- Design, implement, and maintain high performance file storage systems;
- Integrate servers, including database, backup, and compute servers and their associated software, into HPC systems;
- Support automated tools used to perform in-memory, Linux-based installs;
- Produce and analyze performance metrics; research and recommend strategies to improve system efficiency and client experience;
- Collaborate with peers within and outside the Information Technology and Research communities to ensure the effective use of HPC resources by faculty members and their research teams;
- Develop advanced technical training materials and communications to broaden the relevance of services to the community;
- Develop and offer onsite, individual, and online training workshops in HPC tools, technologies, systems, and features;
- Maintain wiki-based, web-accessible FAQs, examples, and tutorials, as well as basic, up-to-date information about systems and services;
- Perform other duties as assigned.

Skills Needed:
- Minimum of five (5) years' experience in High Performance Computing clusters, GPFS, and LSF environments;
- Domain-specific HPC admin and training skills;
- Extensive experience in implementation, installation, configuration, administration, and support of Linux servers;
- Hands-on experience in job scheduling, submission, optimal resource utilization, and reliable workload execution using LSF;
- Experience with TCP/IP, internet routing protocols, private and public networks, VLANs, firewalls, load balancers, addressing schemes, subnet creation, and subnet masking;
- Proven ability to troubleshoot basic network issues and to communicate and work with a team of network engineers to solve possible network design issues in HPC;
- Experience using and programming automated system management tools, both at a general level (e.g. Puppet) and at the cluster level (e.g. Rocks);
- Working knowledge of EMC storage, Dell hardware, and latest technologies;
- Experience in IBM clustering technology (High Performance Computing storage);
- Experience in GPFS implementation and configuration in an HPC environment, including GPFS tuning and performance issues;
- Experience in scripting (Shell, Perl, etc.); system and low-level scripting experience required;
- Experience in Storage Area Network (SAN), enterprise storage, and high-speed interconnect networks;
- Experience in FDR InfiniBand networking.

Thanks,
Sandeep

Sandeep Jain
Software People Inc.
[email protected]
Ph: 631-863-0299, 631-410-4741, 631-921-2111 C
Fax: 631-574-3122
Twitter: Software People @spincjobs
Certifications: SBA 8a/SDB, NY MWBE, VA SWaM, DE OMWBE, MA MWBE
